Tune-A-Video-library (Tune a video concepts library)

not-lain

posted an update 2 days ago

we now have more than 2000 public AI models using ModelHubMixin🤗

prithivMLmods

posted an update 3 days ago

ChemQwen-vL [ Qwen for Chem Vision ] 🧑🏻‍🔬

🧪Model : prithivMLmods/ChemQwen-vL

📝ChemQwen-vL is a vision-language model fine-tuned from the Qwen2-VL-2B-Instruct model. It was trained on the International Chemical Identifier (InChI) format for chemical compounds and is optimized for chemical compound identification. Given an image of a compound, the model generates its InChI and a description of the compound. It operates as a multimodal image-text-to-text model. It has been fine-tuned using datasets from: https://iupac.org/projects/
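To make the output format concrete, here is a minimal sketch of what an InChI string looks like and how its layers can be separated. The helper `split_inchi` is hypothetical and is not part of the model or of any library; it only handles the simple case of a version, a formula, and one-letter-prefixed layers.

```python
def split_inchi(inchi: str) -> dict:
    """Split a simple standard InChI string into version, formula, and prefixed layers.

    Hypothetical helper for illustration only; real InChI strings can have
    layers this sketch does not interpret (and some lack a formula layer).
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    result = {"version": parts[0], "formula": parts[1] if len(parts) > 1 else ""}
    # Remaining layers begin with a one-letter tag, e.g. 'c' (connectivity)
    # or 'h' (hydrogen atoms).
    for layer in parts[2:]:
        if layer:
            result[layer[0]] = layer[1:]
    return result


# Ethanol in standard InChI notation.
ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol)
```

A string in this form is what the post describes the model as producing from a compound image; a splitter like this could be used to sanity-check the generated text before using it.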

📒Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju

Inference results can be exported as documentation with the help of the ReportLab library: https://pypi.org/project/reportlab/

🤗: @prithivMLmods

1 reply

Shaldon

authored 6 papers 4 days ago

not-lain

posted an update 7 days ago

Published a new blog post 📖
In this blog post I go through the transformer architecture, emphasizing how tensor shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :
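As a general illustration of shape propagation (not taken from the blog post itself), the sketch below traces tensor shapes through a standard multi-head self-attention block. The function name and the step labels are assumptions chosen for this example.

```python
def attention_shapes(batch: int, seq_len: int, d_model: int, n_heads: int) -> dict:
    """Return the tensor shape at each step of standard multi-head self-attention."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads
    return {
        "input": (batch, seq_len, d_model),
        # Q, K, and V are each a linear projection of the input.
        "q_k_v_projection": (batch, seq_len, d_model),
        # The model dimension is split across heads and the axes are transposed.
        "split_heads": (batch, n_heads, seq_len, d_head),
        # Q @ K^T compares every position with every other position.
        "attention_scores": (batch, n_heads, seq_len, seq_len),
        # softmax(scores) @ V restores the per-head feature dimension.
        "weighted_values": (batch, n_heads, seq_len, d_head),
        # Heads are concatenated back into the model dimension.
        "merge_heads": (batch, seq_len, d_model),
        "output_projection": (batch, seq_len, d_model),
    }


for name, shape in attention_shapes(batch=2, seq_len=16, d_model=512, n_heads=8).items():
    print(f"{name:>18}: {shape}")
```

The only step whose shape depends quadratically on sequence length is the attention score matrix, which is why long inputs are expensive.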

prithivMLmods

posted an update 10 days ago

200+ f{🤗} on Stranger Zone! [ https://huggingface.co/strangerzonehf ]

❤️‍🔥Stranger Zone's MidJourney Mix Model Adapter is trending on the Very Model Page, with over 45,000 downloads. Additionally, the Super Realism Model Adapter has over 52,000 downloads. Together they remain the top two adapters on Stranger Zone!
strangerzonehf/Flux-Midjourney-Mix2-LoRA, strangerzonehf/Flux-Super-Realism-LoRA

👽Try Demo: prithivMLmods/FLUX-LoRA-DLC

📦Most Recent Adapters to Check Out :
+ Ctoon : strangerzonehf/Ctoon-Plus-Plus
+ Cardboard : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Claude Art : strangerzonehf/Flux-Claude-Art
+ Flat Lay : strangerzonehf/Flux-FlatLay-LoRA
+ Smiley Portrait : strangerzonehf/Flux-Smiley-Portrait-LoRA

🤗Thanks for Community & OPEN SOURCEEE !!

6 replies

prithivMLmods

posted an update 13 days ago

Reasoning SmolLM2 🚀

🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

🔥Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune nB : prithivMLmods/SmolLM2-CoT-360M

prithivMLmods

posted an update 19 days ago

Triangulum Catalogued 🔥💫

🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF

4 replies

ginipick

posted an update 20 days ago

🌊 [Dokdo Membership - Next Generation AI Video Creation Platform]

✨ Transform your imagination into mesmerizing videos with Dokdo Membership, an innovative AI-powered platform that generates unique videos from text and images. Built as a streamlined SaaS boilerplate using Python Gradio for Hugging Face users, this tool offers an intuitive way to create AI-generated videos with minimal effort.

🎯 [Key Features]
- 📧 Email-based authentication system with secure login/signup
- 🎁 15 points automatically credited upon registration
- 💰 5 points deduction per video generation
- 🌏 Bilingual support (Korean/English) with automatic translation
- 🖼️ Optional first frame image upload capability
- ⭐ Automatic GiniGEN.AI watermark integration

🚀 [Technical Specifications]
1. 💫 Modern, responsive user interface with Gradio components
2. 📊 Efficient resource management through points system
3. 🎥 High-quality video generation using advanced AI models
4. 🔄 Seamless translation pipeline for multilingual support
5. ⚡ Real-time point tracking and management system
6. 🛡️ Comprehensive content moderation and filtering

📝 [How to Use]
1. ✅ Register with your email to receive 15 initial points
2. 💭 Enter your video description (supports both English and Korean)
3. 📤 Upload a reference image for the first frame (optional)
4. 🎬 Click "Generate Video" (consumes 5 points)
5. 📥 Preview and download your generated video

🔧 [Technical Implementation]
- Built with Python Gradio for seamless Hugging Face Space integration
- Implements secure user authentication and session management
- Features real-time point tracking and automated deduction system
- Includes comprehensive error handling and input validation
- Utilizes advanced AI models for video generation
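The points flow described above (15 points on registration, 5 points per generated video) can be sketched as a small in-memory ledger. This is an illustrative assumption about how such a system might work, not the platform's actual code; the class name `PointsLedger` and its methods are hypothetical.

```python
SIGNUP_BONUS = 15  # points credited on registration (figure from the post)
VIDEO_COST = 5     # points deducted per video generation (figure from the post)


class PointsLedger:
    """Hypothetical in-memory points tracker keyed by user email."""

    def __init__(self) -> None:
        self._balances: dict[str, int] = {}

    def register(self, email: str) -> int:
        # Minimal input validation; a real service would verify the address.
        if "@" not in email:
            raise ValueError("invalid email address")
        if email in self._balances:
            raise ValueError("email already registered")
        self._balances[email] = SIGNUP_BONUS
        return SIGNUP_BONUS

    def balance(self, email: str) -> int:
        return self._balances[email]

    def charge_for_video(self, email: str) -> int:
        """Deduct the cost of one generation and return the remaining balance."""
        current = self._balances[email]
        if current < VIDEO_COST:
            raise RuntimeError("insufficient points")
        self._balances[email] = current - VIDEO_COST
        return self._balances[email]


ledger = PointsLedger()
ledger.register("user@example.com")
for _ in range(3):
    print(ledger.charge_for_video("user@example.com"))
```

With these figures a new account covers exactly three generations before more points are needed.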

📮 Need additional points for more creations? Contact us at [email protected] for point acquisition options through public contributions or paid services.

ginigen/Dokdo-membership

1 reply

ginipick

posted an update 24 days ago

🎬 Revolutionize Your Video Creation
Dokdo Multimodal AI: Transform a single image into a stunning video with perfect audio harmony! 🚀

Superior Technology 💫
- Advanced Flow Matching: Smoother video transitions surpassing Kling and Sora
- Intelligent Sound System: Automatically generates perfect audio by analyzing video mood
- Multimodal Framework: Advanced AI integrating image, text, and audio analysis

Outstanding Performance 🎯
- Ultra-High Resolution: 4K video quality with bfloat16 acceleration
- Real-Time Optimization: 3x faster processing with PyTorch GPU acceleration
- Smart Sound Matching: Real-time audio effects based on scene transitions and motion

Exceptional Features ✨
- Custom Audio Creation: Natural soundtrack matching video tempo and rhythm
- Intelligent Watermarking: Adaptive watermark adjusting to video characteristics
- Multilingual Support: Precise translation engine powered by Helsinki-NLP

Versatile Applications 🌟
- Social Media Marketing: Create engaging shorts for Instagram and YouTube
- Product Promotion: Dynamic promotional videos highlighting product features
- Educational Content: Interactive learning materials with enhanced engagement
- Portfolio Enhancement: Professional-grade videos showcasing your work

Experience the video revolution with Dokdo Multimodal, where anyone can create professional-quality content from a single image. Elevate your content with perfectly synchronized video and audio that captivates your audience! 🎨

Start creating stunning videos that stand out from the crowd - whether you're a marketer, educator, content creator, or business owner. Join the future of AI-powered video creation today!

ginipick/Dokdo-multimodal

#VideoInnovation #AITechnology #PremiumContent #MarketingSolution

🔊 Please turn on your sound for the best viewing experience!

1 reply

ginipick

posted an update 26 days ago

🎨 GiniGen Canvas-o3: Intelligent AI-Powered Image Editing Platform
Transform your images with precision using our next-generation tool that lets you extract anything from text to objects with simple natural language commands! 🚀
📌 Key Differentiators:

Intelligent Object Recognition & Extraction
• Freedom to select any target (text, logos, objects)
• Simple extraction via natural language commands ("dog", "signboard", "text")
• Ultra-precise segmentation powered by GroundingDINO + SAM
Advanced Background Processing
• AI-generated custom backgrounds for extracted objects
• Intuitive object size/position adjustment
• Multiple aspect ratio support (1:1, 16:9, 9:16, 4:3)
Progressive Text Integration
• Dual text placement: over or behind images
• Multi-language font support
• Real-time font style/size/color/opacity adjustment

🎯 Use Cases:

Extract logos from product images
Isolate text from signboards
Select specific objects from scenes
Combine extracted objects with new backgrounds
Layer text in front of or behind images

💫 Technical Features:

Natural language-based object detection
Real-time image processing
GPU acceleration & memory optimization
User-friendly interface

🎉 Key Benefits:

User Simplicity: Natural language commands for object extraction
High Precision: AI-powered accurate object recognition
Versatility: From basic editing to advanced content creation
Real-Time Processing: Instant result visualization

Experience the new paradigm of image editing with GiniGen Canvas-o3:

Seamless integration of multiple editing functions
Professional-grade results with consumer-grade ease
Perfect for social media, e-commerce, and design professionals

Whether you're extracting text from complex backgrounds or creating sophisticated visual content, GiniGen Canvas-o3 provides the precision and flexibility you need for modern image editing!

GO! ginigen/CANVAS-o3

2 replies

ginipick

updated a model 26 days ago

Tune-A-Video-library/chicken-1

Text-to-Image • Updated 26 days ago • 30

ehristoforu

posted an update 28 days ago

✒️ Ultraset - all-in-one dataset for SFT training in Alpaca format.
fluently-sets/ultraset

❓ Ultraset is a comprehensive dataset for training Large Language Models (LLMs) using the SFT (Supervised Fine-Tuning) method. This dataset consists of over 785 thousand entries in eight languages: English, Russian, French, Italian, Spanish, German, Chinese, and Korean.

🤯 Ultraset solves the problem faced by users when selecting an appropriate dataset for LLM training. It combines various types of data required to enhance the model's skills in areas such as text writing and editing, mathematics, coding, biology, medicine, finance, and multilingualism.

🤗 For effective use of the dataset, it is recommended to use only the "instruction," "input," and "output" columns and train the model for 1-3 epochs. The dataset does not include DPO or Instruct data, making it suitable for training various types of LLMs.
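As a rough sketch of how the three recommended columns could be turned into training text, the example below applies the widely used Stanford Alpaca prompt template. Whether Ultraset's authors intend this exact template is an assumption, and the function names are hypothetical.

```python
def to_alpaca_prompt(row: dict) -> str:
    """Build an Alpaca-style prompt from the 'instruction' and 'input' columns."""
    instruction = row["instruction"].strip()
    context = (row.get("input") or "").strip()
    if context:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Rows with an empty 'input' column use the shorter template.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


def to_training_text(row: dict) -> str:
    """Concatenate the prompt with the 'output' column to form one SFT example."""
    return to_alpaca_prompt(row) + row["output"].strip()


# Made-up row for illustration; it is not taken from the dataset.
example_row = {
    "instruction": "Translate the sentence to French.",
    "input": "Good morning.",
    "output": "Bonjour.",
}
print(to_training_text(example_row))
```

Any other columns in a row are simply ignored, which matches the recommendation to train on these three only.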

❇️ Ultraset is an excellent tool to improve your language model's skills in diverse knowledge areas.

prithivMLmods

posted an update 28 days ago

Sketchify 😉🎨

+ strangerzonehf/Flux-Sketch-Smudge-LoRA
+ strangerzonehf/Flux-Sketch-Sized-LoRA
+ strangerzonehf/Sketch-Paint

- strangerzonehf/sketch-fav-675ba869c7ceaec7e652ee1c

ginipick

posted an update about 1 month ago

🌟 Digital Odyssey: AI Image & Video Generation Platform 🎨
Welcome to our all-in-one AI platform for image and video generation! 🚀
✨ Key Features

🎨 High-quality image generation from text
🎥 Video creation from still images
🌐 Multi-language support with automatic translation
🛠️ Advanced customization options

💫 Unique Advantages

⚡ Fast and accurate results using FLUX.1-dev and Hyper-SD models
🔒 Robust content safety filtering system
🎯 Intuitive user interface
🛠️ Extended toolkit including image upscaling and logo generation

🎮 How to Use

Enter your image or video description
Adjust settings as needed
Click generate
Save and share your results automatically

🔧 Tech Stack

FluxPipeline
Gradio
PyTorch
OpenCV

link: ginigen/Dokdo

Turn your imagination into reality with AI! ✨
#AI #ImageGeneration #VideoGeneration #MachineLearning #CreativeTech

7 replies

akhaliq

posted an update about 1 month ago

Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: akhaliq/anychat

1 reply

Tune a video concepts library

AI & ML interests

Recent Activity

Tune-A-Video-library's activity

Video-P2P: Video Editing with Cross-attention Control

Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

RL-GPT: Integrating Reinforcement Learning and Code-as-policy

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Multi-modal Cooking Workflow Construction for Food Recipes

Generative Video Propagation

Tune-A-Video-library/chicken-1

Team members 83
