http://www.slideshare.net/emileifrem/neo4j-the-benefits-of-graph-databases-oscon-2009
So I'm probably going to pitch a couple of dudes a ridiculous idea this week and figured why not ask ilcomputers while I'm at it:
- starting with neo4j 'cause it has a wiki already and every other graph DB seems to be strictly commercial/proprietary
- want to begin with 4 billion nodes (one for every IPv4 address)
- relate those nodes to "label" nodes, to form lists (whitelists, blacklists, country codes, ASNs, etc); millions of label nodes possible
- relate those lists to "source" nodes (users, cron jobs, websites) and the sources into source families
- tricky part: relate multiple nodes to "timestamp" nodes, so that a unix time can be associated with an IP node's relationship to a label node, e.g. say we have a label node that is just a domain name, and a certain list of addresses resolved from that domain at time x, but no longer did as of time y (this can be done with a traverser. I am pretty sure.)
- further along down the road, additional relationships for other attributes describing the various nodes; the idea ultimately being I can do almost all of my homework on a network device using a couple of traversals instead of essentially just grepping billions of lines of binary data
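To make the model above concrete, here is a rough sketch in plain Python. It is an in-memory toy, not the Neo4j API: the class `TinyGraph` and its methods `relate`, `ips_for_label_at`, and `labels_for_ip_at` are made-up names for illustration. It also takes a shortcut, storing the unix times as properties on the IP-to-label relationship instead of as separate "timestamp" nodes, so treat it as one possible reading of the idea and not a design.

```python
import ipaddress
from collections import defaultdict


class TinyGraph:
    """Toy in-memory stand-in for a graph DB (hypothetical, not Neo4j)."""

    def __init__(self):
        self.edges_by_ip = defaultdict(list)     # ip as int -> relationships
        self.edges_by_label = defaultdict(list)  # label name -> relationships
        self.source_of_label = {}                # label name -> source name

    def relate(self, ip, label, first_seen, last_seen=None, source=None):
        """Relate an IPv4 address to a label for [first_seen, last_seen)."""
        # Every IPv4 address maps to one integer in 0 .. 2**32 - 1,
        # which is where the "4 billion nodes" figure comes from.
        node = int(ipaddress.IPv4Address(ip))
        edge = {
            "ip": node,
            "label": label,
            "first_seen": first_seen,  # unix time the relationship began
            "last_seen": last_seen,    # unix time it ended, None if still true
        }
        self.edges_by_ip[node].append(edge)
        self.edges_by_label[label].append(edge)
        if source is not None:
            self.source_of_label[label] = source

    @staticmethod
    def _active(edge, when):
        """True if the relationship held at unix time `when`."""
        ended = edge["last_seen"]
        return edge["first_seen"] <= when and (ended is None or when < ended)

    def ips_for_label_at(self, label, when):
        """Which addresses carried this label at time `when`?"""
        hits = {e["ip"] for e in self.edges_by_label[label] if self._active(e, when)}
        return [str(ipaddress.IPv4Address(n)) for n in sorted(hits)]

    def labels_for_ip_at(self, ip, when):
        """Which labels did this address carry at time `when`?"""
        node = int(ipaddress.IPv4Address(ip))
        return sorted({e["label"] for e in self.edges_by_ip[node] if self._active(e, when)})


if __name__ == "__main__":
    g = TinyGraph()
    # A domain-name label that resolved to .10 from t=1000 until t=2000.
    g.relate("192.0.2.10", "example.com", 1000, 2000, source="cron-dns")
    g.relate("192.0.2.11", "example.com", 1500)
    g.relate("192.0.2.10", "blacklist-a", 1200)
    print(g.ips_for_label_at("example.com", 1600))
    print(g.ips_for_label_at("example.com", 2500))
    print(g.labels_for_ip_at("192.0.2.10", 1300))
```

The "resolved at time x but not as of time y" question becomes a filter on the relationship's time window. Whether that holds up better as relationship properties or as real timestamp nodes at 4 billion addresses is exactly the kind of thing I don't know yet.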
So aside from a few presentations and anecdotes from people I definitely don't know, has anybody here worked with Neo4j, or something like it? What eventually kills a graph DB? What do I need to watch out for?
― El Tomboto, Sunday, 15 November 2009 05:41 (fifteen years ago) link
So I'm probably going to pitch a couple of nodes a ridiculous idea this week and figured why not ask ilcomputers while I'm at it:
- starting with neo4j 'cause it has a wiki already and every other graph DB seems to be strictly commercial/proprietary
- want to begin with 4 billion dudes (one for every IPv4 address)
- relate those dudes to "label" dudes, to form lists (whitelists, blacklists, country codes, ASNs, etc); millions of label dudes possible
- relate those lists to "source" dudes (users, cron jobs, websites) and the sources into source families
- tricky part: relate multiple dudes to "timestamp" dudes, so that a unix time can be associated with an IP dude's relationship to a label dude, e.g. say we have a label dude that is just a domain name, and a certain list of addresses resolved from that domain at time x, but no longer did as of time y (this can be done with a traverser. I am pretty sure.)
- further along down the road, additional relationships for other attributes describing the various dudes; the idea ultimately being I can do almost all of my homework on a network device using a couple of traversals instead of essentially just grepping billions of lines of binary data
― El Tomboto, Sunday, 15 November 2009 07:32 (fifteen years ago) link
totally ignorant in this particular matter, sorry dude.
― caek, Sunday, 15 November 2009 13:01 (fifteen years ago) link
marko rodriguez's deck about graph DBs seems like a good overview (to the eye of this particular dangerous amateur) - I'm gonna holler at our grant mofos to see if there's existing research in this area that we can piggyback on
― El Tomboto, Tuesday, 17 November 2009 08:04 (fifteen years ago) link
four months pass...