Datasets for data mining free download xls
File Notation Areas are any proposed land transaction, or ...
The data is held in GDA94 latitude and longitude coordinates.
This dataset was ...
This mineral drillcore is usually obtained from mineral exploration companies and is ...
These maps have been ...
United States Census Bureau.
Web Data Commons, structured data from the Common Crawl, the largest public web corpus.
Webhose free datasets.
Wikiposit, a virtual amalgamation of mostly financial data from many different sites, allowing users to merge data from different sources.
Wolfram Alpha disease and patient level data.
AI, connect your data to many of 3.
Yahoo Sandbox datasets: Language, Graph, Ratings, Advertising and Marketing, Competition.
Yelp Dataset, a subset of Yelp businesses, reviews, and user data for use in personal, educational, and academic purposes.
Kaggle launched with a number of machine learning competitions, which subsequently solved problems for the likes of NASA and Ford. It has since evolved into a renowned open data platform, offering cloud-based collaboration for data scientists, as well as educational tools for teaching artificial intelligence and data analysis techniques, plus, of course, tonnes of great datasets covering almost any topic you can imagine.

The US Government has made its data publicly available.
With datasets covering everything from climate change to crime, you can lose yourself in the database for hours. For a government website, it has some surprisingly user-friendly search functions, including the ability to drill down by geographical area, organization type, and file format.
Search results are also clearly labeled at federal, state, county, and city levels.

Type of data: Mostly business and finance
Data compiled by: Datahub
Access: Mostly free, no registration required
Sample dataset: Monthly gold prices

The goal of many data analysts is to help drive savvy business decisions.

However, as online services generate more and more data, an increasing amount is generated in real time and is not available in data set form. Some examples of this include data on tweets from Twitter, and stock price data.
Twitter has a good streaming API, and makes it relatively straightforward to filter and stream tweets. You can get started here. There are tons of options here: you could figure out which states are the happiest, or which countries use the most complex language. We also recently wrote an article to get you started with the Twitter API here.
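To make the "happiest states" idea a bit more concrete, here is a minimal Python sketch. It does not call the Twitter API at all. It assumes you have already collected tweets into plain dictionaries with hypothetical `state` and `text` fields (these names are ours, not Twitter's response format), and it uses a tiny made-up word list in place of a real sentiment lexicon.

```python
from collections import defaultdict

# Tiny illustrative word lists (an assumption for this sketch);
# a real project would use a proper sentiment lexicon or model.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"sad", "awful", "hate", "bad"}


def score_text(text):
    """Return (# positive words) - (# negative words) for one tweet."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def happiness_by_state(tweets):
    """Average the per-tweet score for each state."""
    totals = defaultdict(lambda: [0, 0])  # state -> [score sum, tweet count]
    for tweet in tweets:
        entry = totals[tweet["state"]]
        entry[0] += score_text(tweet["text"])
        entry[1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Hypothetical, hand-written sample data; "state" and "text" are our own field names.
sample = [
    {"state": "CA", "text": "I love this great weather!"},
    {"state": "CA", "text": "Traffic is awful."},
    {"state": "NY", "text": "Sad, bad day."},
]

# Sort states from happiest to least happy.
ranking = sorted(happiness_by_state(sample).items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Word counting like this is crude, but it is enough to check that your collection and grouping steps work before you swap in a better scoring method.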
GitHub has an API that allows you to access repository activity and code. You can get started with the API here. The options are endless: you could build a system to automatically score code quality, or figure out how code evolves over time in large projects.

Quantopian is a site where you can develop, test, and operationalize stock trading algorithms. In order to help you do that, they give you access to free minute-by-minute stock price data. You could build a stock price prediction algorithm.
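If you want a starting point for the stock price prediction idea, a simple baseline is worth having before anything fancier. The Python sketch below is not Quantopian's API and is not a trading strategy. It assumes you already have a list of closing prices (the numbers here are made up), predicts the next price as the average of the last few, and measures how far off that prediction is when you walk through the series.

```python
def moving_average_forecast(prices, window=3):
    """Predict the next price as the mean of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    return sum(prices[-window:]) / window


def mean_absolute_error(prices, window=3):
    """Walk forward through the series and score each one-step forecast."""
    if len(prices) <= window:
        raise ValueError("need more than `window` prices to evaluate")
    errors = [
        abs(moving_average_forecast(prices[:i], window) - prices[i])
        for i in range(window, len(prices))
    ]
    return sum(errors) / len(errors)


# Made-up minute-by-minute closing prices, for illustration only.
minute_closes = [10.0, 11.0, 12.0, 13.0, 14.0]

print(moving_average_forecast(minute_closes))  # forecast for the next minute
print(mean_absolute_error(minute_closes))      # average one-step forecast error
```

Any model you build later should beat this baseline's error on the same data. If it does not, the extra complexity is probably not earning its keep.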
You could use these calls to build up a set of historical weather data, and make predictions about the weather tomorrow.

In this post, we covered good places to find data sets for any type of data science project. We hope that you find something interesting that you want to sink your teeth into! Please let us know!

At Dataquest, our interactive guided projects are designed to help you start building a data science portfolio to demonstrate your skills to employers and get a job in data.
Published: September 16
About the author: Vik Paruchuri.