Just as Google spiders the Internet (automatically crawling through millions of URLs), there are spiders yet to be made that would crawl exciting existing datasets in the hope of producing something useful. Here is a list of spider types that I’ve come up with, ranging from purely informational to physical, at various scales. I have attempted to put them in order of when they may be achieved.
Internet spider - Google / Yahoo / Bing etc.
Internet vertical spider - there is a large range of companies that scrape/spider/record a specific vertical/horizontal dataset.
Space spider - we are already collecting tons of data with telescopes, from CMB data to gravity maps and physical star / asteroid positions. Given the resolution required to capture all data in the universe, we still have a long, expensive way to go here.
Geo spiders - we use satellite imagery to collect data about our planet’s surface and can now use drones to get higher-resolution, more real-time information. We also collect data on the sea floor. Lidar will give us even better laser-based data sets of urban / complex areas. Velodyne currently adds individual data sets to an entire cloud map from all incoming sources (e.g. Google self-driving cars) to help build more accurate maps.
Product spider - there will be a ramp-up in the 3D data we have on products / objects, given the current scanning revolution that is closely tied to the 3D printing revolution.
Bio spider / Gene spider - a robust, nuclear-battery-powered soil/species sampler that captures new genetic data by physically crawling around the earth, relaying data back wirelessly. The objective here would be to find unknown species and genes that may be useful in synthetic biology or medicine. Credit to Sumon Sadhu for coming up with the gene spider. Alternatively, a modified Bdelloidea could “steal” DNA from its targets.
Nanotech spider / materials spider - this spider would essentially be some kind of scanning electron microscope that looks for exotic and interesting surfaces/materials/objects at the nano level. The data would be useful for the nanotech industry, just as scanning gecko feet has inspired new sticky surfaces.
Brain spider - there is a ton of data locked in our brains that is yet to be collected: thoughts, ideas, emotions, memories. For this we need mobile, high-resolution, non-invasive techniques. There are major privacy issues with this spider.
Particle spider - this spider would sit listening for particle interactions on a local level and constantly report back. Would need huge amounts of storage and IO bandwidth.
Abstract dataset spider - this spider would seek to record all informational data sets connected to the Internet. In a sense this is the Internet itself.
Universe spider - this hypothetical spider would be able to collect all data. There are possible limits on how this machine could store all information about the universe inside itself, and on whether it would even be possible to record certain areas of the universe and/or certain types of interactions or data.
Time spider - this hypothetical spider would not be limited to recording what is happening or has already happened. It would be able to record data in the future and the past.
Multiverse spider - this hypothetical spider would be able to record data across multiple universes.