Now I have an idea for how to asyncify the ISO core standard I/O.
The idea is very simple, and I am trying to solve one problem: how
can we make, for example, get_code/2 asynchronous without
sacrificing performance? I came up with this sketch:
get_code(S, C) :- get_code_buffered(S, C), !.
get_code(S, C) :- async_read_buffer(S), get_code(S, C).
There are some unresolved issues in the above code, like
how to return end of file, i.e. -1, or how to deal with surrogate pairs.
But essentially the idea is that the stream S has a buffer
somewhere and that the fast path is to read from this buffer,
and we only need to yield when we replenish the buffer via
some I/O. This could give a quite fast asyncified get_code/2.
Is there any Prolog system that has already done something like this?
I always see get_code/2 as a primitive, and I was also following
this approach in my systems so far. In the above I bootstrap it from two
other primitives, a synchronous one and an asynchronous one,
so that it is itself no longer a primitive. get_code_buffered/2 would
fail if it has reached the end of the buffer.
--- Synchronet 3.20a-Linux NewsLink 1.114
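The fast path / slow path split can also be sketched outside Prolog. The following Python sketch is only an illustration under assumptions: the class BufferedStream, its chunks list standing in for the real I/O source, and the helper read_all are hypothetical and not part of any Prolog system. It returns -1 at end of file; surrogate pairs do not arise here because Python strings are sequences of code points.

```python
import asyncio

class BufferedStream:
    """Hypothetical stream: a text buffer, a read position, and a fake source."""
    def __init__(self, chunks):
        self.chunks = list(chunks)  # assumption: stands in for the real I/O source
        self.buf = ""
        self.pos = 0
        self.eof = False

def get_code_buffered(stream):
    """Fast path: next code point from the buffer, or None if it is exhausted."""
    if stream.pos < len(stream.buf):
        code = ord(stream.buf[stream.pos])
        stream.pos += 1
        return code
    return None

async def async_read_buffer(stream):
    """Slow path: yield to the event loop, then replenish the buffer."""
    await asyncio.sleep(0)  # stands in for real asynchronous I/O
    stream.pos = 0
    if stream.chunks:
        stream.buf = stream.chunks.pop(0)
    else:
        stream.buf = ""
        stream.eof = True

async def get_code(stream):
    """Only awaits when the buffer has to be refilled; -1 signals end of file."""
    while True:
        code = get_code_buffered(stream)
        if code is not None:
            return code
        if stream.eof:
            return -1
        await async_read_buffer(stream)

async def read_all(stream):
    """Collect all codes up to end of file."""
    codes = []
    while (code := await get_code(stream)) != -1:
        codes.append(code)
    return codes
```

For example, asyncio.run(read_all(BufferedStream(["ab", "c"]))) yields to the event loop only at the refill points, not once per code.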
Now I have a first prototype running:
$ node dogelog.mjs
?- create_task((between(1,10,_), write(foo), nl, sleep(100), fail; true)).
true.
?- foo
foo
foo
foo
foo
foo
foo
foo
foo
foo
The task keeps running because the console read is now async.
If the read were blocking instead, the task wouldn't continue running.
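The same effect can be reproduced with Python's asyncio, as a rough analogue rather than the Dogelog Player implementation. In this sketch the names ticker, blocking_read and main are hypothetical, and the blocking console read is simulated with time.sleep so that the example does not wait for real input.

```python
import asyncio
import time

async def ticker(log, count, delay):
    """Background task: records 'foo' a few times, yielding in between."""
    for _ in range(count):
        log.append("foo")
        await asyncio.sleep(delay)

def blocking_read(delay):
    """Assumption: stands in for a blocking console readline."""
    time.sleep(delay)
    return "user input\n"

async def main():
    log = []
    task = asyncio.create_task(ticker(log, 3, 0.01))
    # The blocking read runs in a worker thread, so the event loop stays free.
    line = await asyncio.to_thread(blocking_read, 0.2)
    seen = len(log)  # entries written while the read was still pending
    await task
    return line, seen
```

If blocking_read were called directly inside main, the ticker would not get to run until the read had returned.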
Mostowski Collapse wrote on Wednesday, 10 August 2022 at 09:28:41 UTC+2:
Now I have an idea for how to asyncify the ISO core standard I/O.
[...]
What's also working is async file I/O with the Node.js fs module.
Here are the promises (i.e. open_promise and file_read_promise)
that get fulfilled in the course of action:
?- open('foo.txt', read, S), get_atom(S,-1,A), close(S).
open_promise
file_read_promise
file_read_promise
S = 0rReference, A = 'Hello World!'.
It was not in the request for comments sketch that open
itself is also asynchronous. Astonishingly, error handling
also works, as a side effect of the interrupt signals I already
developed in the spirit of the JavaScript AbortController.
So I am just using the abort reason field to give the error back:
?- open('foo2.txt', read, S), get_atom(S,-1,A), close(S).
open_promise
Fehler: Datei 'foo2.txt' nicht gefunden.
user auf 5
This is a little bit ugly, since it means I am using only 25% of
the JavaScript Promise features, i.e. plain resolve(). The rest,
namely reject() and resolve() with a return value, goes unused.
What I am using to return values is the stream handle
itself as a value holder. Maybe I can remove this ugliness
sometime in the future. Ideally a concept of arbitrary FFI async
predicates would be swell. For now it's a bit of a hack. Next step:
asyncify fetch() in the browser, so as to get rid of my current use of
synchronous XMLHttpRequest, which rightfully causes a warning
in the browser saying it might block.
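The pattern described here, a promise that resolves without a value, the stream handle as value holder, and an abort reason slot for errors, might be sketched as follows. This is an illustration under assumptions: Stream, Task, its signal field, and the blocking_reader parameter are hypothetical names, not the Dogelog Player API.

```python
import asyncio

class Stream:
    """Hypothetical stream handle that doubles as the value holder."""
    def __init__(self):
        self.buf = ""
        self.pos = 0

class Task:
    """Hypothetical task context with an abort-reason slot."""
    def __init__(self):
        self.signal = None

async def file_read_promise(task, stream, blocking_reader):
    """Completes without a value: data goes into the stream, errors into the task."""
    try:
        stream.buf = await asyncio.to_thread(blocking_reader)
        stream.pos = 0
    except OSError as err:
        task.signal = err  # the error travels via the abort reason, not via reject

def run_read(blocking_reader):
    """Drive one read to completion and return both holders for inspection."""
    task, stream = Task(), Stream()
    asyncio.run(file_read_promise(task, stream, blocking_reader))
    return task, stream
```

The caller inspects task.signal after the await instead of catching a rejected promise, which mirrors the workaround described above.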
Mild Shock wrote on Thursday, 1 February 2024 at 23:01:46 UTC+1:
Now I have a first prototype running:
[...]
Now that we have implemented async I/O for Dogelog Player,
we have already brought some of the principles to formerly
Jekejeke Prolog as well. For example, we could get rid of line
terminator compression. But the venture has multiple goals:
- Goal 1: Async I/O; going without compression makes it simpler.
- Goal 2: Novacore, have a smaller core.
So what about Goal 2? One idea behind Goal 2 is to have
a single interface for text and binary streams, and maybe even
to allow changing the encoding midflight.
This can be helpful, for example, in MIME multipart/mixed writing
or reading. So we have to get rid of some Java-ism and ISO-ism:
- ISO-ism: Having separate get_byte/[1,2] and get_code/[1,2];
we have to get rid of that. We are planning to realize binary streams
by an encoding such as "8bit" or "latin1"; we are not yet sure how this
works out. Also the encoding should be mutable.
- Java-ism: In Java, starting with BufferedReader and on top of it
LineNumberReader, line separators such as '\n', '\r' and '\r\n'
are compressed into '\n'. We also have to get rid of that if we want
to process binary streams with the text API.
There is no encoding parameter yet in Dogelog Player; it's all
hardcoded "utf8". Formerly Jekejeke Prolog has an
encoding parameter, but it's immutable. So we will work on this
as well.
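To make the idea of a single code-based interface with a mutable encoding more concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the planned design: the class CodeReader and its methods set_encoding and get_code are hypothetical, and it relies on Python's standard incremental decoders.

```python
import codecs

class CodeReader:
    """Hypothetical reader: one get_code interface over raw bytes."""
    def __init__(self, data, encoding="utf8"):
        self.data = data
        self.pos = 0
        self.set_encoding(encoding)

    def set_encoding(self, encoding):
        """Switch the encoding midflight by swapping the incremental decoder."""
        self.decoder = codecs.getincrementaldecoder(encoding)()

    def get_code(self):
        """Feed bytes one at a time until a code point is complete; -1 at end."""
        while self.pos < len(self.data):
            ch = self.decoder.decode(self.data[self.pos:self.pos + 1])
            self.pos += 1
            if ch:
                return ord(ch)
        return -1
```

With "latin1" every byte comes back as the code point of the same value, so binary parts of a MIME multipart/mixed body could be read through the same get_code call as the text parts.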
Getting rid of the Java-ism of compression was quite fun. We
tried to push it to its limits. So the tokenizer, written 100%
in Prolog, now preserves '\n', '\r' and '\r\n'. But then, when we for
example generate HTML, we have to replace the line terminator
by </div></div>. So how do we do this correctly, without falling
back to an atom_split/2 with separator '\n'?
It's a little bit tricky; for example, an input such as 'abc\r'
should have a line count of 2. So anything that views '\r' as
padding will go wrong. The following works fine:
/**
 * sys_split_lines(L, I, O):
 * The predicate succeeds in L with the lines
 * of the input I and output O codes.
 */
% sys_split_lines(-List, +List, -List)
sys_split_lines([A|L]) -->
   sys_split_line(X), {atom_codes(A,X)},
   sys_split_more(L).

% sys_split_more(-List, +List, -List)
sys_split_more([A|L]) --> sys_convert_sep, !,
   sys_split_line(X), {atom_codes(A,X)},
   sys_split_more(L).
sys_split_more([]) --> [].

% sys_split_line(-List, +List, -List)
sys_split_line([X|L]) --> \+ sys_convert_sep, [X], !,
   sys_split_line(L).
sys_split_line([]) --> [].
The above uses the DCG (\+)/1 construct (% 7.14.11), which is banned (sic!)
by Scryer Prolog. The line separators are kind of pluggable. They are
currently defined as follows, but can be an arbitrary set of arbitrarily
long code combinations:
% sys_convert_sep(+List, -List)
sys_convert_sep --> [0'\r, 0'\n].
sys_convert_sep --> [0'\n].
sys_convert_sep --> [0'\r].
BTW: I think SWI-Prolog already implements some of the ideas,
like encoding switching, for which we do not have a demonstrator
yet. But it's a little bit weak and stubborn concerning line
terminators; it refuses to support CRLF.
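For readers more at home in Python, the splitting logic with pluggable separators can be approximated as below. This is a sketch, not a translation of the Prolog library: the names SEPARATORS, match_sep and split_lines are hypothetical, and the separator tuple must list longer alternatives first so that '\r\n' is not split as '\r' followed by '\n'.

```python
SEPARATORS = ("\r\n", "\n", "\r")  # longer alternatives first

def match_sep(text, pos, seps=SEPARATORS):
    """Length of the separator starting at pos, or 0 if there is none."""
    for sep in seps:
        if text.startswith(sep, pos):
            return len(sep)
    return 0

def split_lines(text, seps=SEPARATORS):
    """Split text at every separator; a trailing separator opens a last empty line."""
    lines, start, pos = [], 0, 0
    while pos < len(text):
        n = match_sep(text, pos, seps)
        if n:
            lines.append(text[start:pos])
            pos += n
            start = pos
        else:
            pos += 1
    lines.append(text[start:])
    return lines
```

Here split_lines("abc\r") gives two lines, 'abc' and the empty line, matching the line count of 2 required above. Further separators can be added by passing a different seps tuple.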
Mild Shock wrote on Wednesday, 14 February 2024 at 12:38:56 UTC+1:
Now that we have implemented async I/O for Dogelog Player,
[...]
Also, Scryer Prolog and Trealla Prolog might have shot
themselves in the foot by usually preferring chars over codes.
This somehow makes a greater divide between text and binary,
but for practical purposes such as MIME multipart/mixed
we have to work the other way around, closing the abyss between
text and binary. Just make it look the same, according to
this principle:
KISS, an acronym for "Keep it simple, stupid!"
https://en.wikipedia.org/wiki/KISS_principle
Just consider how things such as MIME multipart/mixed possibly
evolved. People most likely had switchable encoders and binary
streams, and not a wall between text and binary.
Not sure whether an encoding such as EBCDIC is an argument
in favor of chars? But, for example, adding NEL to the line
separators shouldn't be a problem in itself, in case an EBCDIC to
Unicode encoder insists on producing this code point for line separators.
Mild Shock wrote on Wednesday, 14 February 2024 at 12:48:13 UTC+1:
Getting rid of the Java-ism of compression was quite fun.
[...]
Now asyncifying the I/O of Dogelog Player for Python.
I guess we got our head around the equivalents
of our Java surrogate async/await constructs
"Promise" and "Coroutine". The key utilities in
asyncio are asyncio.to_thread and asyncio.run_coroutine_threadsafe,
which seem to be made especially for these two use cases.
Namely, what was our Java surrogate "Promise" wrapper
now looks like this, what a wonderful code gem:
async def console_promise(buf, stream):
    try:
        res = await asyncio.to_thread(blocking_readline, stream.data)
        stream.buf = res
        stream.pos = 0
    except IOError as err:
        register_signal(buf, map_stream_error(err))

def blocking_readline(data):
    return data.readline()
And what was our Java surrogate "Coroutine" wrapper
now looks like this, what a wonderful code gem again:
def test_sys_http_server_on(args):
    [...]
    obj.func = lambda req, res: baby_come_back(
        launch_async(clause, buf, [req, res]), loop)

def baby_come_back(coro, loop):
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    return future.result()
LoL
But it's much more complicated than what we did
for JDK 21. Also, starting the HTTP server in a separate
thread is extremely frightening:
def test_http_server_listen(args):
    [...]
    thread = threading.Thread(target=blocking_forever, args=(obj,))
    thread.start()

def blocking_forever(obj):
    obj.serve_forever()
It's extremely frightening since the threading documentation warns us:
Thread-based parallelism
In CPython, due to the Global Interpreter Lock,
only one thread can execute Python code at once
https://docs.python.org/3/library/threading.html
global interpreter lock
Also, the GIL is always released when doing I/O.
Past efforts to create a “free-threaded” interpreter
have not been successful
https://docs.python.org/3/glossary.html#term-global-interpreter-lock
But still, our use of asyncio.to_thread and
asyncio.run_coroutine_threadsafe capitalizes on the fact
that the GIL is nevertheless released during I/O,
and we don't see much of an issue here.
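The hand-off from a server thread back into the event loop can be tried in isolation. The following self-contained sketch assumes a toy setup: handle, call_from_thread, worker and main are hypothetical names, the worker thread stands in for the HTTP server thread, and handle stands in for the Prolog callback.

```python
import asyncio
import threading

async def handle(req):
    """Stand-in for the coroutine that serves one request on the event loop."""
    await asyncio.sleep(0)
    return req.upper()

def call_from_thread(coro, loop):
    """Called from a non-loop thread: schedule coro on loop and block for the result."""
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    return future.result(timeout=5)

def worker(loop, requests, results):
    """Stand-in for the server thread that receives requests."""
    for req in requests:
        results.append(call_from_thread(handle(req), loop))

async def main(requests):
    loop = asyncio.get_running_loop()
    results = []
    thread = threading.Thread(target=worker, args=(loop, requests, results))
    thread.start()
    # Join in another thread so that the event loop stays free to run handle().
    await asyncio.to_thread(thread.join)
    return results
```

Joining the worker directly on the loop thread would deadlock this sketch, since the loop could then never run the scheduled coroutines.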
- UTF-16 native strings: Does work well with the envisioned
"latin1" payload, the range 0..255 is encoded in one 16-bit word,
which is two bytes, but most programming languages already
implement clever strings:
JEP 254: Compact Strings
We propose to change the internal representation of the String class
from a UTF-16 char array to a byte array plus an encoding-flag field.
The new String class will store characters encoded either as
ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes
per character), based upon the contents of the string. The encoding
flag will indicate which encoding is used.
https://openjdk.org/jeps/254
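That a "latin1" payload maps the byte range 0..255 one-to-one onto the code points 0..255 can be checked with a few lines of Python. The helper names bytes_to_codes and codes_to_bytes are hypothetical and chosen only for this sketch.

```python
def bytes_to_codes(payload):
    """Decode a binary payload as latin1: one code point per byte."""
    return [ord(ch) for ch in payload.decode("latin1")]

def codes_to_bytes(codes):
    """Encode code points 0..255 back into the identical bytes."""
    return "".join(chr(code) for code in codes).encode("latin1")
```

A round trip such as codes_to_bytes(bytes_to_codes(payload)) gives back the original payload for any bytes object, so no information is lost by treating binary data as latin1 text.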
Now I have rewritten the Tic-Tac-Toe example
to be 100% Prolog. Originally the Tic-Tac-Toe example
was conceived as a first stab at exploring the
foreign function interface (FFI) of Dogelog Player
inside the browser, registering JavaScript functions
that do all kinds of stuff with the DOM and events.
But now I have library(markup) for DOM generation
and library(react) for events. So I rewrote Tic-Tac-Toe
using these utilities, reducing the amount of
JavaScript logic to zero. Tic-Tac-Toe is now 100% Prolog.
What also went down the drain is abusing consult_async()
to do the game initialization; instead I am now using
perform_async(). So the code went from this dangerous call:
await consult_async(":- ensure_loaded('browser.p').");
It is dangerous because of possible file name quoting issues.
It went to this, where the file name is a string object and doesn't
need to be Prolog encoded, because we don't invoke a query
encoded as Prolog text but a Prolog term:
await perform_async(new Compound("ensure_loaded", ["browser.p"]));
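The quoting hazard can be illustrated independently of Dogelog Player. In this Python sketch all names are hypothetical: naive_query_text pastes the file name into Prolog text, quote_atom shows the escaping a text-based approach would need, and query_term passes the name as plain data, with a tuple standing in for a compound term.

```python
def naive_query_text(file_name):
    """Builds Prolog text by pasting; breaks if the name contains a quote."""
    return ":- ensure_loaded('" + file_name + "')."

def quote_atom(text):
    """Escape backslashes and single quotes for a quoted Prolog atom."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

def safe_query_text(file_name):
    """Text-based variant that escapes the file name first."""
    return ":- ensure_loaded(" + quote_atom(file_name) + ")."

def query_term(file_name):
    """Term-based variant: the name is data, so no escaping is needed."""
    return ("ensure_loaded", [file_name])
```

A file name such as it's.p yields unbalanced quotes with naive_query_text, whereas query_term carries it through unchanged.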
In this respect we should do some hydration experiment.
What is hydration? It's a new buzzword around the partially
obsolete design of first having the HTML body in a browser
document and then, at the end of the HTML body, some scripts:
r/webdev - What is Hydration?
https://www.reddit.com/r/webdev/comments/xqd4i8/what_is_hydration/
The design with the bundle at the end of the HTML body usually
takes time(html)+time(bundle). A better design is using
async loading and the quasi-parallelism of the browser,
loading the bundle in the head if possible. The load time
is then around max(time(bundle), time(html)), which might
give a better user experience. We should try the same
for our examples and load Dogelog Player in the head. But
the Prolog text loader is not yet task safe, so this might
involve some more work until we can try it.
Also, we might nevertheless want to do a little hydration
when the HTML body has been read, like wiring event handlers.
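The difference between time(html)+time(bundle) and max(time(bundle), time(html)) can be made tangible with a small simulation. This is only a sketch with simulated delays: fetch, sequential, parallel and compare are hypothetical names, and asyncio.sleep stands in for the network and parsing time.

```python
import asyncio
import time

async def fetch(seconds):
    """Stand-in for loading one resource."""
    await asyncio.sleep(seconds)

async def sequential(t_html, t_bundle):
    """Bundle at the end of the body: the loads run one after the other."""
    start = time.perf_counter()
    await fetch(t_html)
    await fetch(t_bundle)
    return time.perf_counter() - start

async def parallel(t_html, t_bundle):
    """Bundle loaded asynchronously from the head: the loads overlap."""
    start = time.perf_counter()
    await asyncio.gather(fetch(t_html), fetch(t_bundle))
    return time.perf_counter() - start

async def compare(t_html, t_bundle):
    """Return the measured (sequential, parallel) elapsed times."""
    return await sequential(t_html, t_bundle), await parallel(t_html, t_bundle)
```

Running asyncio.run(compare(0.05, 0.1)) should report a sequential time near the sum of the two delays and a parallel time near the larger one.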
Note that, because of the await in front of
perform_async(), our loading doesn't create a task yet.
It will change the current load sequence. It will
only allow tasks created before the await to get
their share of work. We would need to add one of our
create_task utilities, or use the async option of a
script tag, as recommended here for MathJax v3:
/* put this in the head */
<script type="text/javascript" id="MathJax-script" async
src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-svg.js">
</script>
The async option of a script tag is described as:
"For module scripts, if the async attribute is
present then the scripts and all their dependencies
will be fetched in parallel to parsing and evaluated
as soon as they are available."
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#async