Releases: Marker-Inc-Korea/AutoRAG
v0.3.7
What's Changed
- fix the error and release 0.3.5-rc1 by @vkehfdl1 in #842
- Add Huggingface Space at README.md by @bwook00 in #847
- Add new Sample YAML file by @bwook00 in #848
- Fix README.md by @Jake-Song in #850
- Add AWS Bedrock llm and upgrade VERSION 0.3.6 by @bwook00 in #856
- Add roadmap and other badges at README.md by @vkehfdl1 in #862
- Add multimodal support to llama parse by @bwook00 in #868
- ✨ feat: Update supporting nodes and modules information in index.md by @hongsw in #859
- Add External VectorDB Connections by @vkehfdl1 in #872
- Release/v0.3.7 by @vkehfdl1 in #883
New Contributors
- @Jake-Song made their first contribution in #850
Full Changelog: v0.3.5...v0.3.7
v0.3.5
What's Changed
- Run validation at the start_trial by @vkehfdl1 in #826
- AutoRAG API version & API Docker container + GPU-version Docker container by @vkehfdl1 in #823
- Add FlashRank Reranker module by @bwook00 in #818
- set the fixed port number of the panel dashboard by @vkehfdl1 in #827
- change stream to astream, and add non-async stream function by @vkehfdl1 in #835
- add setup python at sphinx.yml by @vkehfdl1 in #836
- Change recency filter parameter name to threshold_datetime from threshold by @vkehfdl1 in #837
- Release/v0.3.5 by @vkehfdl1 in #838
- [Hotfix] Rename Konlpy in chunk_full.yaml by @bwook00 in #840
Full Changelog: v0.3.4...v0.3.5
v0.3.4
What's Changed
- Add OpenVINO Reranker module by @bwook00 in #808
- Properly truncate to 8000 tokens when we use OpenAI Embeddings by @vkehfdl1 in #812
- Refactor API server with streaming and passage return by @vkehfdl1 in #810
- ✨ feat: Added Docker push workflow, Dockerfile updates, and build script by @hongsw in #807
- Add VoyageAI Reranker module by @bwook00 in #809
- calculate the right cosine similarity score at the get_id_scores by @vkehfdl1 in #816
- Japanese language support (日本語対応) by @wooheum-xin in #814
- Add Mixedbread AI Reranker Module by @bwook00 in #805
- Release/v0.3.4 by @vkehfdl1 in #813
New Contributors
- @wooheum-xin made their first contribution in #814
Full Changelog: v0.3.3...v0.3.4
v0.3.3
What's Changed
- [Parse Bug] Fix parsing only the first page of whole PDF files by @bwook00 in #783
- [Parse Bug] Handle pages without tables when using clova.py by @bwook00 in #784
- Prevent the error where httpx uses a different event loop during method chaining on the QA by @vkehfdl1 in #785
- add deepeval metrics by @Eastsidegunn in #750
- Release/v0.3.3 by @vkehfdl1 in #803
Full Changelog: v0.3.2...v0.3.3
v0.3.2
v0.3.1
What's Changed
- Add toctree by @bwook00 in #745
- Fix minor errors at the documentations by @vkehfdl1 in #747
- add effective_order at bleu as True by @vkehfdl1 in #748
- add passage dependency filter at data creation by @vkehfdl1 in #751
- Add Passage Dependency at README.md by @bwook00 in #761
- docs: update data_format.md by @eltociear in #772
- change the README and tutorial of deploying the result. by @vkehfdl1 in #769
- Partial Windows support for AutoRAG by @vkehfdl1 in #766
- Add Dockerfile and Docker configuration for AutoRAG production environment by @hongsw in #763
- Add three evolving methods to QA creation by @vkehfdl1 in #767
- Fix a possible error when the QA retrieval_gt shape is different by @vkehfdl1 in #774
- Bump version 0.3.1 by @vkehfdl1 in #776
New Contributors
- @eltociear made their first contribution in #772
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Refactoring to v0.3.0 for efficient deployment by @vkehfdl1 in #727
- resolve vllm error by @vkehfdl1 in #735
- Change data creation package names to v0.3 by @vkehfdl1 in #740
- Add more yaml file by @bwook00 in #743
- Update README for v0.3.0 by @bwook00 in #739
- Bump version 0.3.0 by @vkehfdl1 in #741
Full Changelog: v0.2.18...v0.3.0
🚀 AutoRAG v0.3.0 is Here! 🚀
We're thrilled to introduce AutoRAG v0.3.0, packed with new features and key improvements. Here’s what’s new:
1. Improved Response Time for Deployment
In earlier versions, the response time during deployment was slow, making it difficult to use the optimized RAG pipeline. With v0.3.0, we've significantly reduced the response time, making deployment much more efficient for user-facing services.
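For a user-facing service, deployment can look like the minimal sketch below: load an optimized pipeline from a finished trial folder and answer a single query. The Runner class, its from_trial_folder/run entry points, and the trial path shown here are assumptions based on the deployment guide, not an excerpt from it; check the current docs for the exact signatures.

```python
# Minimal sketch: serve an optimized pipeline from a finished trial folder.
# Names and paths are assumptions based on the deployment guide; verify
# against the current AutoRAG docs before relying on them.
from autorag.deploy import Runner

# "./benchmark/0" is a hypothetical trial directory produced by an AutoRAG run.
runner = Runner.from_trial_folder("./benchmark/0")

# Run a single query through the optimized RAG pipeline.
answer = runner.run("What is AutoRAG?")
print(answer)
```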
2. Re-designed Data Creation Process
Data creation is an essential part of optimizing RAG pipelines, and we've made the process much smoother. In earlier versions, this feature was still in its early stages. Now, in v0.3.0, you can build the data creation process within AutoRAG.
We’ve added AutoParse and AutoChunk, allowing you to configure, parse, and chunk your data using a single YAML file. You can also easily compare different methods to refine your pipeline. Whether you build QA datasets with LLMs or manually, this structure offers a human-in-the-loop process to help you create and manage your data.
Check out the detailed guide on data creation.
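To make the YAML-driven workflow concrete, here is a minimal sketch of driving AutoParse and AutoChunk from Python. The Parser and Chunker entry points (start_parsing, Chunker.from_parquet, start_chunking), their parameters, and the file paths are assumptions drawn from the data creation guide; consult that guide for the exact signatures and YAML schemas.

```python
# Minimal sketch of the v0.3 parse -> chunk flow, each step driven by its own
# YAML file (e.g. modules such as langchain_parse or llama_index_chunk).
# Class, method, and path names are assumptions; see the data creation guide.
from autorag.parser import Parser
from autorag.chunker import Chunker

# Parse raw documents into a parsed result using the modules in parse.yaml.
parser = Parser(data_path_glob="./raw_docs/*.pdf")
parser.start_parsing("./parse.yaml")

# Chunk the parsed result into a corpus using the modules in chunk.yaml.
# The parquet path below is a hypothetical output of the parsing step.
chunker = Chunker.from_parquet(parsed_data_path="./parsed_result.parquet")
chunker.start_chunking("./chunk.yaml")
```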
3. Python & Library Support Updates
- Python 3.9 is no longer supported. Please upgrade to Python 3.10.
- AutoRAG now works with LangChain 0.3, LlamaIndex 0.11, Pydantic v2, and OpenAI o1 models.
Share Your Feedback
Your insights help us improve AutoRAG! Let us know how these updates impact your workflow and what you’d like to see in future versions.
Join our Discord server now!
Thank you for being part of the AutoRAG journey!
v0.2.18
What's Changed
- change add_file_name language notation by @bwook00 in #717
- Ingest bm25_tokenizer and embedding only in the strategy of other modules by @vkehfdl1 in #716
- OpenAI o1 model compatibility by @vkehfdl1 in #719
- Compatible with Langchain version 0.3.0 by @bwook00 in #724
- Release/v0.2.18 by @vkehfdl1 in #726
Full Changelog: v0.2.17...v0.2.18
v0.2.17
What's Changed
- Add update corpus feature for chunking optimization by @vkehfdl1 in #706
- Add func annotation about parse module by @bwook00 in #708
- Add baseline beta docs by @bwook00 in #710
- Finish new data creation documentation by @vkehfdl1 in #711
- Finish Chunk and Parse documentation by @bwook00 in #712
- fix vectordb score bug by @vkehfdl1 in #713
Full Changelog: v0.2.16...v0.2.17
v0.2.16
What's Changed
- Replace FastAPI with Flask by @rjwharry in #657
- Mock all OpenAI Embeddings at the test code for outside contributors by @vkehfdl1 in #659
- Add basic dataset schema for new 'beta' version of data creation by @vkehfdl1 in #663
- Add AutoParse baseline and modules 'langchain_parse' and 'clova' by @bwook00 in #660
- Add llamaparse module by @bwook00 in #666
- replace yaml.dump with yaml.safe_dump by @rjwharry in #669
- Add table hybrid parse module by @bwook00 in #668
- [Data Creation Refactoring] Add generate qa set features by @vkehfdl1 in #678
- Add more data creation methods by @vkehfdl1 in #680
- add (auto)chunk and its first module llama_index_chunk by @bwook00 in #681
- [Data Creation Refactoring] Add don't know filter at data creation and its docs by @vkehfdl1 in #686
- [Chunk] Add "path" and "start_end_idx" at chunk return by @bwook00 in #685
- add override at Raw and Chunker from_parquet classmethod by @vkehfdl1 in #692
- [Chunk] Add langchain chunk module by @bwook00 in #693
- Fix bug when using vllm in a multi-GPU environment by @vkehfdl1 in #697
- Add chunk method at Raw schema and test whole pipeline to generate initial dataset. by @vkehfdl1 in #698
- fix an issue with loading HuggingfaceLLM models by @jis478 in #652
- [Bug] Update to kiwipiepy version 0.18.0 or higher by @bwook00 in #704
- refactor existing metric python files with input schema by @Eastsidegunn in #667
- Bump version 0.2.16 by @vkehfdl1 in #705
New Contributors
Full Changelog: v0.2.15...v0.2.16