i was forced into the role of data analyst a long time ago. now i am obsessed with finding cool datasets to play around with. i generally work with spatial data, but i'd love to hear about any type. i am not a super data scientist or anything, but we do a lot of interesting stuff at work with all the free imagery data available nowadays. the possibilities are endless and terrifying. anyone out there doing similar work in environmental, public health or policy work?
I'd like to dig into this database of prisons created by the University of Michigan at some point: https://cjars.org/
― Comfortably numbnuts (Heez), Monday, 11 March 2024 16:21 (three months ago) link
You should sign up for Data Is Plural, a long-running newsletter from a former data editor at BuzzFeed, with weekly recommendations and a vast archive: https://www.data-is-plural.com/
I always like to see what the people at The Pudding do, in general. It gives me lots of ideas: https://pudding.cool/
And in general, following the graphics desks at publications is always good as well. There used to be a GitHub page that collected all their Twitter handles/repos/portfolios. Maybe this one? https://github.com/wbkd/awesome-interactive-journalism
Information Is Beautiful awards/Malofiej/SND also good places to check. There's also a famous newsletter more on the data science/academic world...
― fpsa, Monday, 11 March 2024 18:29 (three months ago) link
And you can always scrape stuff easily these days. I have a very simple XML parser to scrape data from Amoeba. I'm trying to do something with the 'What's in my bag?' videos, and it's pretty cool to see it all in one place. Trying to do the same with The Quietus' Baker's Dozen.
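For anyone who wants to try the same kind of thing, here is a minimal sketch in Python of flattening an XML document into rows. The XML layout, tag names and attribute names below are invented for illustration; they are not Amoeba's actual markup, and a real page would need its own selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: the structure and names here are made up for the example.
SAMPLE_XML = """
<episodes>
  <episode guest="Guest A">
    <pick artist="Artist One" title="Album One"/>
    <pick artist="Artist Two" title="Album Two"/>
  </episode>
  <episode guest="Guest B">
    <pick artist="Artist One" title="Album Three"/>
  </episode>
</episodes>
"""


def parse_picks(xml_text):
    """Return one dict per pick, carrying the guest's name down to each row."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for episode in root.iter("episode"):
        guest = episode.get("guest")
        for pick in episode.iter("pick"):
            rows.append(
                {
                    "guest": guest,
                    "artist": pick.get("artist"),
                    "title": pick.get("title"),
                }
            )
    return rows


picks = parse_picks(SAMPLE_XML)
```

Once everything is in a flat list like `picks`, it is easy to dump to CSV or count which artists turn up most often.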
― fpsa, Monday, 11 March 2024 18:31 (three months ago) link
Yeah, apparently with some really large datasets there can be some patterns that are hard to notice visually but that jump right out when you listen....
― m0stly clean (Slowsquatch), Monday, 11 March 2024 19:49 (three months ago) link
Nice. I also work with a lot of spatial data, mostly for work (UK-oriented socioeconomic and demographic type data - for which I have a huge directory if anyone needs it), occasionally for fun or messing around to learn new R code or test new visualisation ideas.
Some ideas:
Kaggle - seems to have a broad selection of stuff (albeit of varying quality); my favourite to date is probably data on the Eurovision Song Contest.
ONS internal migration data - one of my favourites to tinker with (https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/internalmigrationinenglandandwales), covering where people have moved to/from at local authority level in England and Wales going back several years.
Movebank.org - animal collar/satellite tracking data. It can be difficult to find datasets with large sample sizes but some are pretty robust. I got a lot of mileage out of some Arctic Fox data a couple years ago.
This GitHub repo has a bunch of map layers, originally from the 70s and digitised in the 90s, for tree species of North America (https://github.com/wpetry/USTreeAtlas)
London Fire Brigade animal callouts - I've never managed to do much with this, but LFB has a years-long database of animal rescue callouts, which includes some real oddities like a bearded dragon getting stuck in a car engine and stuff like people getting stuck in trees after trying to retrieve cats stuck in trees (https://data.london.gov.uk/dataset/animal-rescue-incidents-attended-by-lfb)
Consumer Data Research Centre - buckets of interesting UK data including really granular estimates of fuel poverty, food insecurity, resident churn, energy efficiency of properties...
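If anyone wants a starting point for the to/from migration type of data, here is a rough sketch in Python of turning origin-destination flows into net migration per area. It assumes rows of (origin, destination, number of moves); the area names and numbers are made up, and the real ONS files have their own column names and layout.

```python
from collections import defaultdict

# Hypothetical flows: (origin area, destination area, number of moves).
flows = [
    ("A", "B", 120),
    ("B", "A", 80),
    ("A", "C", 30),
    ("C", "B", 50),
]


def net_migration(flows):
    """Return inflow minus outflow for every area that appears in the flows."""
    net = defaultdict(int)
    for origin, destination, moves in flows:
        net[origin] -= moves       # people leaving the origin
        net[destination] += moves  # people arriving at the destination
    return dict(net)


net = net_migration(flows)
```

Because every move leaves one area and arrives in another, the net figures should sum to zero, which is a handy sanity check before mapping anything.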
― salsa shark, Tuesday, 12 March 2024 22:02 (three months ago) link