Book Review: Building Data-Driven Applications with LlamaIndex

Learn how to build LLM-based applications with LlamaIndex

Andrew Lukyanenko
3 min read · Aug 26, 2024

I was offered the chance to read this book in exchange for an honest review.

https://www.amazon.ae/Building-Data-Driven-Applications-LlamaIndex-retrieval-augmented/dp/183508950X

This is a great book that will be useful to anyone who wants to improve their understanding of the LlamaIndex library or working with LLMs in general. It is suitable for both beginners and those with experience working with LLMs who want to use LlamaIndex.

What I liked:

  • The introduction to LLMs is well-written and comprehensive, with thorough descriptions of their limitations and downsides.
  • Having a practical project that is created and updated throughout the book is an excellent approach to learning. It provides hands-on experience and helps readers understand how concepts apply in real-world scenarios.
  • The technical instructions for the project setup are clear and easy to follow, which is essential for readers trying to replicate the examples.
  • I like how the author describes terms and concepts in layman’s terms, using clarifying examples.
  • The book offers practical advice on many things, including data cleaning and hyperparameter tuning.
  • I support the author’s approach of using pre-made libraries and tools. While it could be fun to write everything from scratch, it would be out of the scope of the book. In reality, people usually use existing solutions instead of writing everything themselves — this way, we save time and avoid potential bugs.
  • Given the size of the LlamaIndex library, I’m glad that the author provides numerous examples of its API. This helps readers understand the library’s capabilities and how to use them effectively.
  • The description of various indexation approaches is thorough and informative.
  • The sections on potential cost estimation are particularly practical — this is quite important in real-world scenarios. The suggestions on cost reduction are also valuable for those working with limited budgets.
  • Just when I thought that it would be better to apply asynchronous operations to the retrieval process, I found a section about it. This demonstrates the author’s foresight in addressing common optimization techniques.
  • I particularly enjoyed the section on chatbots and agents, one of the most popular use cases for LLMs. Having previously built a chatbot using LangChain, I was interested in reading how it would be implemented using LlamaIndex. While using OpenAIAgent could be considered too high-level for some, the author balances this by later describing agent runners and agent workers, which let you work with data at a lower level.
  • Tracing is rightly emphasized as an important part of any system. The author correctly points out that logging every action and metadata is crucial for analysis, debugging, and solution improvement.

Some thoughts on what could be added (though I understand this could be out of the scope):

  • While the book is comprehensive, I wish there were a comparison with LangChain, as it would provide valuable context for readers familiar with other frameworks.
  • While the section on reranking was informative, I wish it were longer and provided more in-depth coverage of this important topic.

The book can be read in two ways: either cover to cover, or by getting acquainted with the high-level topics first and returning for specific implementations as needed.

Overall, this book provides a comprehensive guide to building data-driven applications with LlamaIndex, balancing theoretical knowledge with practical implementation details.