Created
June 24, 2010 11:43
| danoyoung # is it possible to have a "nested bucket"? Something like /<bucket>/2009/<key> | |
| benblack # buckets don't nest | |
| danoyoung # ok...that's what I thought thanx. | |
| benblack # np | |
| seancribbs # danoyoung: you can include slashes in the bucket or | |
| key name if you escape them | |
| danoyoung # the reason I ask is that we have satellite data that | |
| we want to model where the <bucket> would be the satellite, and | |
| then all the satellite data organized in a way over <year>/<key>, which | |
| would be something like 2009/001 (year/day) | |
| benblack # you can still do that as seancribbs described, it | |
| just wouldn't be in another bucket | |
| danoyoung # It's more for organizational reasons...we have 600 | |
| (and growing) data sets from various satellites and would like | |
| to organize them by satellite and then a date. | |
| I'll look into that, thanx. | |
| seancribbs # so, compose the date and satellite into a bucket | |
| name (or key name) | |
| danoyoung # yea, that's one option.... probably something like | |
| quikscat_2008, quikscat_2009, etc... | |
| seancribbs # right, or even quikscat_201006 | |
| danoyoung # We have a measurement for every day of that | |
| particular year, so I was thinking about a key that would represent the day of year.... | |
| seancribbs # that would work too | |
| danoyoung # so something like quikscat_2009/001, etc... | |
| seancribbs # the important thing is being able to derive the key, or | |
| have a way to find it easily. sounds like your scheme will work well | |
| danoyoung # most of the data would be looked up via dataset, year, | |
| and then a particular day w/n that year.... | |
| danoyoung # any other advanced queries, like "show me all of the | |
| Vertical polarization measurements for the quikscat dataset for this | |
| time range X" would be handled via map/reduce...or at least that's | |
| what I'm thinking currently. I plan on front ending Riak with Elastic | |
| Search for these types of queries...until Riak search comes out and I | |
| see what that looks like. | |
| seancribbs # and that would be easy to specify the inputs for | |
| danoyoung # yea, we basically have very different "schemas" based on | |
| datasets, and trying to make an RDBMS (postgres) work for this type of | |
| data has been painful. Riak seems to be a nice fit. | |
| in addition we have close to a petabyte of data that will need to be | |
| "query'able". right now we can't do it. | |
| seancribbs # yes, that would be hard | |
| danoyoung # it's more the schema constraints...the amount of data | |
| doesn't help either....but with each new satellite NASA launches, | |
| we get an entirely different data set shape....so it's challenging | |
| to support the unknown schema ahead of time. | |
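
The naming scheme discussed above (satellite and year composed into the bucket name, day of year as the key, slashes escaped if they appear in a name) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the conversation: the helper names (`bucket_and_key`, `object_path`, `mapred_job`) are hypothetical, the `/riak` URL prefix is an assumption about how the cluster's HTTP interface is exposed, and the choice of the built-in `Riak.mapValuesJson` map function presumes the stored values are JSON.

```python
import json
from datetime import date, timedelta
from urllib.parse import quote


def bucket_and_key(dataset, day):
    """Derive (bucket, key) from a dataset name and a date.

    Follows the scheme from the chat: bucket "quikscat_2009",
    key "001" (zero-padded day of year).
    """
    bucket = f"{dataset}_{day.year}"
    key = f"{day.timetuple().tm_yday:03d}"
    return bucket, key


def object_path(bucket, key, prefix="/riak"):
    """Build a URL path for one object, escaping slashes in either name.

    `prefix` is an assumption; adjust it to match your cluster.
    safe="" makes quote() turn "/" into "%2F".
    """
    return f"{prefix}/{quote(bucket, safe='')}/{quote(key, safe='')}"


def mapred_inputs(dataset, start, end):
    """List [bucket, key] pairs for every day from start to end, inclusive."""
    n_days = (end - start).days
    return [
        list(bucket_and_key(dataset, start + timedelta(days=i)))
        for i in range(n_days + 1)
    ]


def mapred_job(dataset, start, end):
    """Serialize a map-only job whose inputs are the derived keys."""
    job = {
        "inputs": mapred_inputs(dataset, start, end),
        "query": [
            # Assumes values are stored as JSON documents.
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}}
        ],
    }
    return json.dumps(job)
```

Because every key can be derived from the dataset name and the date, a time-range query never needs to list the keys in a bucket: the inputs for the map/reduce job are computed on the client, which is what makes them "easy to specify". If a slash is kept inside a key instead (for example bucket `quikscat` with key `2009/001`), `object_path` escapes it so it is not read as a path separator.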