Write Readable Conditions with Polars + Extract Text with Docling
Plus convert PDFs to text with Docling
Grab your coffee. Here are this week’s highlights.
📅 Today’s Picks
Write Readable Multi-Condition Logic with Polars when-then-otherwise
Problem
In pandas, conditional columns are usually built with np.where(), which sits outside the method chain and turns into nested, hard-to-read calls as soon as there are several conditions.
The apply() alternative runs a Python function row by row, so it is slow, and it also breaks the DataFrame workflow.
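To make the nesting concrete, here is a small sketch. The score column and the grade thresholds are made up for illustration and do not come from the full article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: exam scores to be bucketed into letter grades
df = pd.DataFrame({"score": [95, 82, 67, 40]})

# Each extra condition adds another np.where() inside the previous one
df["grade"] = np.where(
    df["score"] >= 90,
    "A",
    np.where(
        df["score"] >= 80,
        "B",
        np.where(df["score"] >= 60, "C", "F"),
    ),
)

print(df)
```

With three thresholds the expression is already three levels deep, and the assignment has to happen as a separate statement rather than inside a chain.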
Solution
Polars provides when().then().otherwise() chains that integrate naturally with method chaining.
With pandas, nested np.where() calls stack up for each additional condition, creating deeply nested expressions. Polars replaces this with readable chains where each condition appears sequentially.
Key benefits:
Natural flow with method chaining
Each condition stands on its own line
No nested function calls
Maintains data transformation workflow
The pattern scales cleanly from two conditions to ten without sacrificing readability.
📖 View Full Article | 🧪 Run code
Extract Text from Any Document Format with Docling
Problem
Have you ever needed to pull text from PDFs, Word files, slide decks, or images for a project? Writing a different parser for each format is slow and error-prone.
Solution
Docling’s DocumentConverter takes care of that by detecting the file type and applying the right parsing method for PDF, DOCX, PPTX, HTML, and images.
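A minimal sketch of the basic workflow, assuming Docling is installed and that a file named report.pdf exists locally (the filename and the helper name document_to_markdown are hypothetical). The first conversion may take a while because Docling may need to download its models.

```python
from pathlib import Path


def document_to_markdown(source: str) -> str:
    """Convert a document (path or URL) to Markdown text with Docling."""
    # Imported here so the helper can be defined before docling is loaded
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    # convert() detects the input format and picks the matching pipeline
    result = converter.convert(source)
    return result.document.export_to_markdown()


if __name__ == "__main__":
    sample = Path("report.pdf")  # hypothetical input file
    if sample.exists():
        # Print only the start of the output to keep the console readable
        print(document_to_markdown(str(sample))[:500])
```

The same helper should accept a .docx, .pptx, or .html path without changes, since the converter chooses the parser from the input.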
Other features of Docling:
AI-powered image descriptions for searchable diagrams
Export to pandas DataFrames, JSON, or Markdown
Structure-preserving output optimized for RAG pipelines
Built-in chunking strategies for vector databases
Parallel processing handles large document batches efficiently
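As a rough illustration of the export options listed above, the sketch below pulls the tables of a converted document into pandas DataFrames and also returns a JSON-ready dictionary. The helper name export_document and the file name report.pdf are hypothetical, and the sketch assumes both Docling and pandas are installed.

```python
from pathlib import Path


def export_document(source: str) -> dict:
    """Return Markdown, a JSON-ready dict, and table DataFrames for one document."""
    # Imported here so the helper can be defined before docling is loaded
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    return {
        "markdown": doc.export_to_markdown(),
        "json": doc.export_to_dict(),
        # One pandas DataFrame per table detected in the document
        "tables": [table.export_to_dataframe() for table in doc.tables],
    }


if __name__ == "__main__":
    sample = Path("report.pdf")  # hypothetical input file
    if sample.exists():
        exported = export_document(str(sample))
        print(f"Found {len(exported['tables'])} table(s)")
```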
📖 View Full Article | 🧪 Run code
☕️ Weekly Finds
lm-evaluation-harness [Machine Learning] - Unified framework for testing and evaluating generative language models across a wide range of benchmarks and tasks with support for local models and custom metrics
PyMC [Probabilistic Programming] - Probabilistic programming library for Python that allows users to build Bayesian models with a simple Python API and fit them using state-of-the-art methods
Quarkdown [Documentation] - Modern Markdown typesetting system with powerful extensions for creating books, articles, and presentations. Supports function calls, custom functions, and outputs HTML, PDF, and slides
Before You Go
🔍 Explore More on CodeCut
Tool Selector - Discover 70+ Python tools for AI and data science
Production Ready Data Science - A practical book for taking projects from prototype to production


