This repository contains a Golang tool to streamline the process of creating Machine Learning Datasets using public Social APIs.

KAEVO/MLDatasetCreator

# MLDatasetCreator

A Golang tool that provides efficient creation of Machine Learning Datasets from public Social Network APIs. Its main function is to manage concurrent operations and save data, offering a user-friendly and intuitive API.

## Overview

Social APIs are frequently used for creating datasets for training Machine Learning models. For instance, models like Tweet2Vec aim to extract features or create embeddings from such data.

Many models, particularly NLP-oriented ones, can benefit from a large repository of structured text that may or may not carry labeling.

Often, creating such datasets takes time away from feature engineering and model formulation work. This tool aims to simplify the process.

Although the author is new to the Machine Learning space, they are open to feedback and hope to share something genuinely useful in the near future.

## TODO

- [x] Feature-based API
- [x] Concurrency using Goroutines, channels
- [ ] Caching operations to prevent repeated requests
- [ ] Save data in various formats (CSV, JSON)
- [ ] Support different API data formats (JSON, XML)
- [ ] Authentication
- [ ] Command line functionality
- [ ] Demo
- Additional ideas and suggestions are welcome
