fast JSON parser in C
I'm considering wrapping one of the many fast JSON parsers written in C, using the Erlang FFI. I'm still just learning how the pieces fit together - I got to hello world via this tutorial:

http://www.wagerlabs.com/blog/2008/02/erlang-ffi---in.html

The first hurdle I've found is that it might not be all that efficient to use C for a JSON parser, as the communication between Erlang and a C function is limited to buffers. If communication is essentially limited to strings, I'll end up writing a parser in Erlang for whatever I concoct on the C side... so I may be better off working to speed up some of the existing pure Erlang JSON parser implementations.

But perhaps I'm missing something. Is there a way to construct complex Erlang data structures (nested tuples and lists, with binary, atom, integer and float components) directly inside C, and then return the constructed object to Erlang, where it can do things like participate in pattern matching?

Thanks!
Chris

--
Chris Anderson
http://jchris.mfdz.com
_______________________________________________
erlang-questions mailing list
erlang-questions@...
http://www.erlang.org/mailman/listinfo/erlang-questions
|
|
Re: fast JSON parser in C
Chris,

I'm currently using something just as you are describing. I have erl_to_json() and json_to_erl() functions in C which convert ETERM -> JSON and JSON -> ETERM. Both are recursive C functions which make use of erl_eterm from erl_interface.

The erl_interface library allows you to construct all the Erlang term types in C, marshal/encode them, and pass them off directly to Erlang.

http://www.erlang.org/doc/apps/erl_interface/ref_man_erl_interface_frame.html

Beware: I just posted to erlang-bugs an issue I'm having with this. It works just fine for smaller sizes, but we're having an issue crop up when we attempt to unmarshal/decode a large ETERM in C using erl_decode() from erl_interface. It is a reproducible "bug", as the problem only exists in C using erl_interface; Erlang has no issues with it at all.

Hope that helps.

Jonathan Gray
Streamy Inc.

-----Original Message-----
From: erlang-questions-bounces@... [mailto:erlang-questions-bounces@...] On Behalf Of Chris Anderson
Sent: Tuesday, July 22, 2008 4:30 PM
To: Erlang Questions
Subject: [erlang-questions] fast JSON parser in C
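
As a rough illustration of Jonathan's point that a term can be built on the C side and handed back to Erlang as an encoded buffer, here is a minimal sketch that writes the external term format (the byte layout that term_to_binary produces) for the tuple {ok, N} by hand. It deliberately avoids erl_interface and ei so that it stays self-contained; the function name encode_ok_small_int is hypothetical, and a real port or driver would more likely call the library's own encoders.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag bytes from the Erlang external term format. */
enum {
    ETF_VERSION       = 131, /* leading version byte */
    ETF_SMALL_INTEGER = 97,  /* followed by one unsigned byte */
    ETF_ATOM          = 100, /* followed by a 2-byte length and the name */
    ETF_SMALL_TUPLE   = 104  /* followed by a 1-byte arity and the elements */
};

/* Hypothetical helper, not part of erl_interface or ei.
 * Writes the encoding of {ok, n} into buf and returns the number of
 * bytes written, or 0 if cap is too small. */
size_t encode_ok_small_int(uint8_t *buf, size_t cap, uint8_t n)
{
    static const char name[] = "ok";
    const size_t name_len = sizeof name - 1;
    /* version + (tuple tag, arity) + (atom tag, 2-byte length) + name
     * + (integer tag, value) */
    const size_t need = 1 + 2 + 3 + name_len + 2;
    size_t i = 0;

    if (cap < need)
        return 0;

    buf[i++] = ETF_VERSION;
    buf[i++] = ETF_SMALL_TUPLE;
    buf[i++] = 2;                          /* arity of the tuple */
    buf[i++] = ETF_ATOM;
    buf[i++] = (uint8_t)(name_len >> 8);   /* atom length, big-endian */
    buf[i++] = (uint8_t)(name_len & 0xff);
    memcpy(buf + i, name, name_len);
    i += name_len;
    buf[i++] = ETF_SMALL_INTEGER;
    buf[i++] = n;
    return i;
}
```

On the Erlang side, binary_to_term/1 applied to the ten bytes produced for n = 42 should yield {ok, 42}, which can then be pattern matched like any other term. So the channel being "limited to buffers" does not by itself force a second parser to be written in Erlang.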
|
|
Re: fast JSON parser in C
On Tue, Jul 22, 2008 at 8:09 PM, Jonathan Gray <jlist@...> wrote:
> Chris,
>
> I'm currently using something just as you are describing. I have
> erl_to_json() and json_to_erl() functions in C which convert ETERM -> JSON
> and JSON -> ETERM. Both are recursive C functions which make use of
> erl_eterm from erl_interface.

Awesome to hear that you are building this already. Are you interested in open sourcing it? My goal is to build a faster JSON parser for CouchDB. I don't mind doing the work, but if you already have something happening, why duplicate work?

Thanks for the technical feedback. I'll definitely take a look at the marshaling stuff, but I'd rather look at your json stuff. :)

Chris

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: fast JSON parser in C
Chris,

Considering your primary goal is speed, you definitely don't want to use my implementation. I use an intermediate translation into a custom typed JSON tree. It's a poor approach when you need it to be as fast as possible. It was done this way for simplicity, as the JSON -> tree conversion (and vice versa) already existed when we wanted to go on to Erlang/ETERM. If a direct parser existed I'd certainly use it.

I'm still exploring the issues with segfaulting during decoding. It's been suggested that I use ei rather than erl_interface, but it's quite a bit more involved. An ideal JSON -> Erlang parser would certainly use ei directly, so I'd dig around there for how you can build/decode Erlang terms from within C.

http://www.erlang.org/doc/man/ei.html

Jonathan

-----Original Message-----
From: erlang-questions-bounces@... [mailto:erlang-questions-bounces@...] On Behalf Of Chris Anderson
Sent: Tuesday, July 22, 2008 10:19 PM
To: Jonathan Gray
Cc: erlang-questions@...
Subject: Re: [erlang-questions] fast JSON parser in C
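
To make the "intermediate typed JSON tree" approach that Jonathan describes more concrete, here is a small sketch of its general shape: a tagged tree node and a recursive walk that emits Erlang's external term format. This is not Jonathan's code. The jnode type and all function names are hypothetical, the bytes are written by hand rather than through ETERM or ei so the example stays self-contained, and the mapping (JSON strings to binaries, arrays to lists) is just one possible assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical typed JSON tree, reduced to three node kinds. */
typedef enum { J_INT, J_STRING, J_ARRAY } jtype;

typedef struct jnode {
    jtype type;
    int32_t i;                  /* J_INT */
    const char *s;              /* J_STRING, NUL-terminated */
    const struct jnode *items;  /* J_ARRAY */
    uint32_t count;             /* J_ARRAY: number of items */
} jnode;

typedef struct { uint8_t *data; size_t len, cap; int overflow; } outbuf;

/* Append n bytes, or latch the overflow flag if they do not fit. */
static void put(outbuf *o, const void *p, size_t n)
{
    if (o->overflow || n > o->cap - o->len) { o->overflow = 1; return; }
    memcpy(o->data + o->len, p, n);
    o->len += n;
}

static void put_u8(outbuf *o, uint8_t v) { put(o, &v, 1); }

static void put_u32(outbuf *o, uint32_t v)  /* big-endian */
{
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
                     (uint8_t)(v >> 8),  (uint8_t)v };
    put(o, b, sizeof b);
}

/* Recursive walk: one external-format term per tree node. */
static void encode_node(outbuf *o, const jnode *n)
{
    switch (n->type) {
    case J_INT:
        put_u8(o, 98);                      /* INTEGER_EXT */
        put_u32(o, (uint32_t)n->i);
        break;
    case J_STRING: {
        size_t len = strlen(n->s);
        put_u8(o, 109);                     /* BINARY_EXT */
        put_u32(o, (uint32_t)len);
        put(o, n->s, len);
        break;
    }
    case J_ARRAY:
        if (n->count > 0) {
            put_u8(o, 108);                 /* LIST_EXT */
            put_u32(o, n->count);
            for (uint32_t k = 0; k < n->count; k++)
                encode_node(o, &n->items[k]);
        }
        put_u8(o, 106);  /* NIL_EXT: list tail, or the empty list itself */
        break;
    }
}

/* Returns the encoded size, or 0 if buf is too small. */
size_t json_tree_to_etf(const jnode *root, uint8_t *buf, size_t cap)
{
    outbuf o = { buf, 0, cap, 0 };
    put_u8(&o, 131);                        /* version byte */
    encode_node(&o, root);
    return o.overflow ? 0 : o.len;
}
```

The cost Jonathan mentions is visible here: the JSON text has to be parsed into jnode values first and then walked a second time, which a direct parser would avoid.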
|
|
Re: fast JSON parser in C
On Wed, Jul 23, 2008 at 11:31 AM, Jonathan Gray <jlist@...> wrote:
> Chris,
>
> Considering your primary goal is speed, you definitely don't want to use my
> implementation.

Thanks then for the info. My plan was to use http://www.lloydforge.org/projects/yajl/ as a fast JSON parser, and somehow get the results to Erlang. It looks like it may be more challenging than I'd thought.

Out of curiosity, at roughly what size of input do you start to see the segfaults? I'll probably start with the ei interface if that is safer, but it will be nice to know how hard I have to push it to know that I've done the job safely.

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: fast JSON parser in C
That's the interesting thing.

I've successfully encoded and decoded >1.5MB binary chunks of Erlang terms when I manually create them using erl_interface. However, when I get a big chunk (around 80-120K) directly from Erlang as a binary (using term_to_binary in Erlang), I'm unable to decode it using erl_decode from erl_interface, though it can be decoded fine from within Erlang.

I haven't received any responses as to a fix and was redirected towards using ei instead of erl_interface (seemingly deprecated, but not documented as deprecated). I'm going to do some testing with ei and will let you know if the problem goes away.

Also, I'll take a look at Yajl now. Thanks.

JG

-----Original Message-----
From: jchris@... [mailto:jchris@...] On Behalf Of Chris Anderson
Sent: Wednesday, July 23, 2008 10:20 AM
To: Jonathan Gray
Cc: erlang-questions@...
Subject: Re: [erlang-questions] fast JSON parser in C
|
|
Re: fast JSON parser in C
On Wed, Jul 23, 2008 at 1:49 PM, Jonathan Gray <jlist@...> wrote:
> However when I get a big chunk (around 80-120K) directly from Erlang as
> binary (using term_to_binary in erlang), I'm unable to decode it using
> erl_interface erl_decode, though it can be decoded fine from within Erlang.

Good to know - CouchDB's Erlang -> JSON encoding is fast enough to not need help from C. I'm just working on making the JSON -> Erlang fast enough, so as long as I can get string buffers over to C in the first place, it sounds like you're not having a problem moving data from C back to Erlang.

Time to buckle down and code!

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: fast JSON parser in C
You might want to go with the ei interface rather than the erl_interface since it is not as clumsy to use. Further, you might want to have a look at the ejabberd expat driver and just replace the expat stuff with your JSON SAX events.

-Martin
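
One wrinkle with Martin's suggestion of driving the encoder from SAX-style events: in the external term format a list carries its element count up front, while an event stream only learns that count when the array ends. The sketch below shows one way around this, reserving the length field at the start of an array and patching it at the end. All names here (sax_enc, on_start_array and so on) are hypothetical and are not the API of expat, yajl or ei; only integers and arrays are handled to keep it short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 32

/* Hypothetical encoder state shared by the event handlers. */
typedef struct {
    uint8_t *buf;
    size_t len, cap;
    size_t len_pos[MAX_DEPTH]; /* offset of each open list's length field */
    uint32_t count[MAX_DEPTH]; /* elements seen so far in each open list */
    int depth;
    int failed;
} sax_enc;

static void emit(sax_enc *e, const void *p, size_t n)
{
    if (e->failed || n > e->cap - e->len) { e->failed = 1; return; }
    memcpy(e->buf + e->len, p, n);
    e->len += n;
}

static void store_u32(uint8_t *dst, uint32_t v)  /* big-endian */
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Every value that arrives inside an open array is one more element. */
static void note_element(sax_enc *e)
{
    if (e->depth > 0)
        e->count[e->depth - 1]++;
}

void sax_init(sax_enc *e, uint8_t *buf, size_t cap)
{
    const uint8_t version = 131;
    memset(e, 0, sizeof *e);
    e->buf = buf;
    e->cap = cap;
    emit(e, &version, 1);
}

void on_integer(sax_enc *e, int32_t v)
{
    uint8_t b[5] = { 98, 0, 0, 0, 0 };        /* INTEGER_EXT + 4 bytes */
    store_u32(b + 1, (uint32_t)v);
    note_element(e);
    emit(e, b, sizeof b);
}

void on_start_array(sax_enc *e)
{
    const uint8_t b[5] = { 108, 0, 0, 0, 0 }; /* LIST_EXT, length patched later */
    if (e->failed || e->depth == MAX_DEPTH) { e->failed = 1; return; }
    note_element(e);
    e->len_pos[e->depth] = e->len + 1;        /* just past the tag byte */
    e->count[e->depth] = 0;
    e->depth++;
    emit(e, b, sizeof b);
}

void on_end_array(sax_enc *e)
{
    const uint8_t nil = 106;                  /* NIL_EXT */
    if (e->failed || e->depth == 0) { e->failed = 1; return; }
    e->depth--;
    if (e->count[e->depth] == 0)
        e->len = e->len_pos[e->depth] - 1;    /* empty array: drop the header */
    else
        store_u32(e->buf + e->len_pos[e->depth], e->count[e->depth]);
    emit(e, &nil, 1);                         /* list tail, or [] itself */
}

/* Returns the encoded size, or 0 on overflow or unbalanced events. */
size_t sax_finish(const sax_enc *e)
{
    return (e->failed || e->depth != 0) ? 0 : e->len;
}
```

Feeding it the events for the JSON text [1, [2], []] should produce a buffer that decodes to the same nested list in Erlang, with no intermediate tree built along the way.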
|
|
Re: fast JSON parser in C
Since JSON seems to be ubiquitous, it seems to me that there would be a strong case for a couple of new BIFs, term_to_json and json_to_term. These would work like binary_to_term and term_to_binary, the difference being that instead of converting to a binary containing the external term format, we convert to a binary containing a JSON encoded string.

I imagine that modeling this code on the existing term_to_binary and binary_to_term code would not be impossibly difficult (you have to make it reentrant and not hog the CPU for too long ... and so on ...)

And yes - it is pretty horrid increasing the number of BIFs but this might just be a useful addition.

This would ease seamless integration with a lot of external programs :-)

/Joe Armstrong

On Thu, Jul 24, 2008 at 11:02 AM, Martin Carlson <martin@...> wrote:
> You might want to go with the ei interface rather than the erl_interface
> since it is not as clumsy to use. Further, you might want to have a look
> at the ejabberd expat driver and just replace the expat stuff with your
> json SAX events.
>
> -Martin
|
|
Re: fast JSON parser in C
Joe,

It would certainly ease our integration. JSON is certainly becoming ubiquitous and tighter integration with Erlang would be very useful.

One note: in my current implementation, and in general, there's no real direct mapping for JSON objects/dictionaries. While Erlang does have dicts, we were never able to "create" them using erl_interface/ei, though I'm sure it's possible by dissecting the dict representation. I use a {obj, [{},{}]} type representation now to handle JSON objects in Erlang. Any thoughts?

+1 on new JSON BIFs :) Willing to help in any way I can.

Jonathan Gray

-----Original Message-----
From: erlang-questions-bounces@... [mailto:erlang-questions-bounces@...] On Behalf Of Joe Armstrong
Sent: Thursday, July 24, 2008 3:49 AM
To: Martin Carlson
Cc: erlang-questions@...
Subject: Re: [erlang-questions] fast JSON parser in C
|
|
Re: fast JSON parser in C
I second this. I have not yet gotten to the point in my project where I need this, but it is certainly a need looming out there on the horizon. With Yaws/Erlyweb/MochiWeb/etc. becoming more prevalent and important, I could see this being a huge win for Ajax/Comet libraries.

On Thu, Jul 24, 2008 at 1:19 PM, Jonathan Gray <jlist@...> wrote:
> Joe,

--
An idea that is not dangerous is unworthy of being called an idea at all. -- Oscar Wilde
|
|
Re: fast JSON parser in C
There are pretty decent pure Erlang JSON libraries available. They're pretty fast, relatively speaking, and they certainly don't crash the Erlang interpreter ;) I would worry about C when you actually need to.

2008/7/24 Rick R <rick.richardson@...>:
> I second this. I have not yet gotten to the point in my project where I need
> this. But it is certainly a need looming out there on the horizon. With
> Yaws/Erlyweb/MochiWeb/etc becoming more prevalent and important. I could see
> this being a huge win for ajax/comet libraries.
|
|
Re: fast JSON parser in C
On Thu, Jul 24, 2008 at 4:43 PM, Bob Ippolito <bob@...> wrote:
> There are pretty decent pure Erlang JSON libraries available. They're
> pretty fast, relatively speaking, and they certainly don't crash the
> Erlang interpreter ;) I would worry about C when you actually need to.

I researched some into the C way of doing things, and wasn't sure how much overhead the ei communication would consume. Currently I'm working on a leex/yecc parser for JSON. The output format is quite flexible, so for now I'm just building it to pass the CouchDB cjson test_suite. Once it is working, it should be trivial to alter the format to fit an agreed-upon convention.

I'll probably finish in the next day or two, and then I'll have an idea of whether using leex/yecc to generate Erlang provides a big speed boost. If it doesn't, at least I had fun!

Joe's BIF idea does seem like the long-term solution.

--
Chris Anderson
http://jchris.mfdz.com
|
|
Re: fast JSON parser in C
Hi Joe,

Joe Armstrong wrote:
> Since JSON seems to be ubiquitous, it seems to me that
> there would be a strong case for a couple of new BIFs,
> term_to_json and json_to_term.

I'm a great fan of your principle of removing something for everything you add to the language. What would you want to remove in exchange for JSON BIFs?

JSON doesn't seem ubiquitous to me, but maybe I haven't been getting out enough lately.

Cheers,
Dominic Williams
http://dominicwilliams.net
|
|
Re: fast JSON parser in C
On Thu, Jul 24, 2008 at 2:53 PM, Chris Anderson <jchris@...> wrote:
> On Thu, Jul 24, 2008 at 4:43 PM, Bob Ippolito <bob@...> wrote:
>> There are pretty decent pure Erlang JSON libraries available. They're
>> pretty fast, relatively speaking, and they certainly don't crash the
>> Erlang interpreter ;) I would worry about C when you actually need to.
>
> I researched some into the C way of doing things, and wasn't sure how
> much overhead the ei communication would consume. Currently I'm
> working on a leex/yecc parser for JSON. The output format is quite
> flexible, so for now I'm just building it to pass the CouchDB cjson
> test_suite. Once it is working, it should be trivial to alter the
> format to fit an agreed-upon convention.
>
> I'll probably finish in the next day or two, and then I'll have an
> idea of whether using leex/yecc to generate Erlang provides a big
> speed boost. If it doesn't, at least I had fun!
>
> Joe's BIF idea does seem like the long-term solution.

I'd be curious to know if leex/yecc can do any better than mochijson2 (which is written by hand), especially considering that it uses binaries instead of strings.

http://code.google.com/p/mochiweb/source/browse/trunk/src/mochijson2.erl

-bob
|
|
Re: fast JSON parser in C |