AI & ML interests

The next generation of large language models focuses on optimization for excellent reasoning, multi-task knowledge, and multilingual support.

ghost-x's activity

lamhieu posted an update 2 days ago
Power Up RAG, Virtual Assistants, and Perplexity Alternatives! 🚀

🔗 Docsifer + Lightweight Embeddings API = The perfect duo for next-gen solutions!

- 📄 Docsifer: Seamlessly convert PDFs, Word, JSON, and URLs to Markdown, ideal for building clean, structured knowledge bases.
- ✨ Lightweight Embeddings API: Create multilingual and multimodal embeddings for fast, accurate search, reranking, and understanding.

🤖 Build smarter RAG pipelines, enhance virtual assistants, or craft powerful Perplexity-like applications with this free, production-ready combo.

👉 Start optimizing today:
- Docsifer: lamhieu/docsifer
- Lightweight Embeddings API: lamhieu/lightweight-embeddings

💡 Faster insights. Better recommendations. Global reach. 🚀
lamhieu posted an update 5 days ago
🚀 Docsifer: Convert Anything to Markdown! 📝

Transform your files into Markdown with Docsifer, your all-in-one tool for diverse formats like PDF, Word, Excel, JSON, HTML, CSV, ZIP, and even audio/images. Supports URL-to-Markdown too! 🔗✨

🌟 Why Docsifer?
- Multi-Format: Convert virtually any document type.
- Flexible & Accurate: Powered by MarkItDown and optional LLMs for advanced text extraction.
- Privacy-First: No data storage; only minimal anonymous stats.
- Open Source: Transparent and community-driven.
- Production-Ready: Docker, API, and interactive playground on Hugging Face Spaces.

👉 Try it out or contribute:
🌐 Hugging Face: lamhieu/docsifer
💻 GitHub: https://github.com/lh0x00/docsifer

Convert smarter. Collaborate better. Start now! 🚀
lamhieu posted an update 7 days ago
Unlock seamless document conversion with Docsifer, powered by MarkItDown at its core! 🚀 Effortlessly transform PDFs, Word, Excel, images, audio, HTML, and more into clean, structured Markdown, perfect for developers, writers, and content creators. With optional LLM-enhanced extraction and robust format support, Docsifer ensures accuracy, speed, and privacy.
🌟 Try it now and experience professional-grade Markdown conversion: lamhieu/docsifer
lamhieu posted an update 17 days ago
🚀 Unlock the power of a completely free, unlimited multilingual API!
🌐 The Lightweight Embeddings API offers state-of-the-art text and image embeddings, advanced reranking, and seamless support for over 100 languages, with no limits and no restrictions.
🌟 Try it now: lamhieu/lightweight-embeddings
lamhieu posted an update 5 months ago
🎯 Ghost 8B Beta 1608: Empowering Your AI Assistant
📦 Unlock the Power of Ghost 8B Beta 1608: Build Your Personal AI Companion
Ghost 8B Beta 1608 empowers you to create a safe and multilingual AI assistant tailored to your needs, directly on your personal computer. 🧑‍💻 Leverage AI's capabilities within your own space! 🚀 Ghost 8B Beta 1608 is ready to become your AI companion.
~
📦 Use Ghost 8B Beta 1608 as your personal AI assistant tool!
With Ghost 8B Beta 1608, you can harness the power of AI to build your own AI assistant that provides safe, personalized language support. 🧑‍💻 Enjoy the benefits of AI on your personal computer! 🚀 Ghost 8B Beta 1608 is ready to become your AI partner.
lamhieu/ghost-8b-beta-8k
ghost-x/ghost-8b-beta-668ead6179f93be717db4542
lamhieu posted an update 5 months ago
🚀 We’re excited to launch Ghost 8B Beta (1608), a top-performing language model with unmatched multilingual support and cost efficiency.

Key Highlights:
- Superior Performance: Outperforms Llama 3.1 8B Instruct, GPT-3.5 Turbo, Claude 3 Opus, GPT-4, and more in winrate scores.
- Expanded Language Support: Now supports 16 languages, including English, Vietnamese, Spanish, Chinese, and more.
- Enhanced Capabilities: Improved math, reasoning, and instruction-following for better task handling.

With two context options (8k and 128k), Ghost 8B Beta is perfect for complex, multilingual applications, balancing power and cost-effectiveness.

๐Ÿ”— Learn More: https://ghost-x.org/docs/models/ghost-8b-beta
ghost-x/ghost-8b-beta-668ead6179f93be717db4542
lamhieu updated a Space 5 months ago
lamhieu posted an update 6 months ago
🎉 Ghost 8B Beta Released: Game-Changing Language Model
--
Ghost 8B Beta is a groundbreaking language model developed with a clear vision: to deliver exceptional multilingual support and superior knowledge capabilities while remaining cost-effective. The model comes in two context lengths, 8k and 128k, for flexibility across tasks. It also has built-in multilingual functionality, making it a powerful tool for global communication and understanding.
--
* See detailed article: https://huggingface.co/blog/lamhieu/ghost-8b-beta-released-game-changing-language-mode
* Model card: ghost-x/ghost-8b-beta
* Official website: https://ghost-x.org/docs/models/ghost-8b-beta
lamhieu posted an update 6 months ago
🤯 Ghost 8B Beta emerges as a clear leader, surpassing even proprietary models like xAI Grok 1, OpenAI GPT 3.5, and Mistral Mixtral 8x7B. This dominance extends to its parity with Mistral Medium, further solidifying its position as a top-tier language model. Furthermore, Ghost 8B Beta stands out as one of only three models employing the zero-shot method for evaluation, alongside Claude 2 and Claude 3, showcasing its unique capabilities and potential for groundbreaking applications.
---
💬 Chat with the model here:
- Playground with Ghost 8B Beta (β, 8k): lamhieu/ghost-8b-beta-8k
- Playground with Ghost 8B Beta (β, 128k): lamhieu/ghost-8b-beta-128k
- Official website: https://ghost-x.org/docs/models/ghost-8b-beta/
lamhieu posted an update 6 months ago
🎉 The Ghost 8B Beta model outperforms prominent models such as Llama 3 8B Instruct and GPT 3.5 Turbo in the lc_winrate score. It also outperforms Claude 3 Opus, Claude 3 Sonnet, GPT-4, and Mistral Large on the AlpacaEval 2.0 winrate score.

Ghost 8B Beta is a large language model developed with goals that include excellent multilingual support, superior knowledge capabilities, and cost-effectiveness. The model comes in two context length versions, 8k and 128k, along with multilingual function tools support by default.
The languages supported are 🇺🇸 English, 🇫🇷 French, 🇮🇹 Italian, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean and 🇨🇳 Chinese.

Explore the Potential:
To learn more about this groundbreaking language model, visit the official website or explore the online demo platforms:
- Ghost 8B Beta (β, 8k) on Spaces: lamhieu/ghost-8b-beta-8k
- Ghost 8B Beta (β, 128k) on Spaces: lamhieu/ghost-8b-beta-128k
- Official website: https://ghost-x.org/docs/models/ghost-8b-beta
lamhieu posted an update 6 months ago
Ghost 8B Beta is a large language model developed with goals that include excellent multilingual support, superior knowledge capabilities, and cost-effectiveness. The model comes in two context length versions, 8k and 128k, along with multilingual function tools support by default.
* The languages supported are 🇺🇸 English, 🇫🇷 French, 🇮🇹 Italian, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean and 🇨🇳 Chinese.
* 👨‍💻 Try on Spaces: lamhieu/ghost-8b-beta-8k
* 📋 Official website: https://ghost-x.org/docs/models/ghost-8b-beta
lamhieu posted an update 7 months ago
Wow, this is amazing! 🤯
Samba is a powerful hybrid model with an unlimited context length, combining Mamba, MLP, Sliding Window Attention, and MLP stacking. Samba's largest version, Samba-3.8B, was trained on 3.2 trillion tokens; it excels in benchmarks like MMLU, GSM8K, and HumanEval, and shines in long-context tasks with minimal tuning.
---
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Github: https://github.com/microsoft/Samba
lamhieu posted an update 8 months ago
Haloooo, I'm continuing to experiment with a checkpoint of Ghost Beta (small version) taken during stage 1 of training (training progress: 41%).

Supported languages: 🇺🇸 English, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇫🇷 French, 🇮🇹 Italian, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean, 🇨🇳 Chinese, and !?

Note that this is not a final result; it is just a snapshot of the model's current state. If you find it interesting, please follow the project at:
* https://x.com/ghostx_ai
* https://ghost-x.org/
* https://huggingface.co/ghost-x

Ghost X is currently very open to invitations to cooperate, share and support.
🤯👇
lamhieu posted an update 8 months ago
Following the previous survey, Ghost Beta (small version) will support 9+ languages fluently. The model is designed to be trained in 3 stages; here is a checkpoint to try from stage 1 (training progress: 29%).

Supported languages: 🇺🇸 English, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇫🇷 French, 🇮🇹 Italian, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean, 🇨🇳 Chinese, and !?

Note that this is not a final result; it is just a snapshot of the model's current state. If you find it interesting, please follow the project at:
* https://x.com/ghostx_ai
* https://ghost-x.org/
* https://huggingface.co/ghost-x

🤯👇
lamhieu posted an update 8 months ago
🎉 Happy to announce a collection called "Blackhole". It is a black hole of high-quality, multilingual data across many fields for training LLMs with SFT and DPO methods.
📦 There are now over 30 high-quality datasets available, so you can start creating interesting models. The collection will be updated in the future; glad if it helps someone.

lamhieu/blackhole-66473b7feec034b4fb70818a