From Machine Learning
A new dataset with more than 100M high-quality, curated images, with captions and metadata! [P]
Hello everyone.
The new dataset is named MONET. It is Apache 2.0-licensed and available on HF:
https://huggingface.co/datasets/jasperai/monet
MONET is an open, Apache 2.0-licensed image–text dataset. It was built from 2.9 billion images and refined to 104.9 million high-quality samples.
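For anyone who wants to peek at a few samples without downloading everything, here is a minimal sketch using the Hugging Face `datasets` library in streaming mode. The split name `"train"` is an assumption, and the post does not list column names, so the example only prints whatever keys each sample has. The helpers `take` and `stream_monet` are hypothetical names used for illustration, and the repo may also require a config name; check the dataset card.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def take(samples: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return at most the first n items of an iterable."""
    return list(islice(samples, n))


def stream_monet(split: str = "train") -> Iterable[Dict[str, Any]]:
    """Open the dataset lazily so nothing is downloaded up front.

    Assumption: the repo exposes a split named "train".
    """
    # Imported here so the helper above works without the library installed.
    from datasets import load_dataset  # pip install datasets

    return load_dataset("jasperai/monet", split=split, streaming=True)


if __name__ == "__main__":
    for sample in take(stream_monet(), 3):
        # Column names are not given in the post, so list the keys that exist.
        print(sorted(sample.keys()))
```

Streaming seems like the sensible default at this scale, since a full download of roughly 105 million image-text pairs is unlikely to fit on a typical workstation.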
We are also publishing a paper that explains how the dataset was created, if you are curious, along with 3 companion projects:
- A UMAP to visualize the distribution
- A retrieval tool for text or image search
- A codebase to train a T2I model based on MONET
Hope this will be useful!