Get startedTutorials

Tutorial - Information Enrichment & Omni Search

Introduction

Building upon our Property Search & Review System, this tutorial focuses on leveraging semi-structured data to create a more flexible and powerful rental platform. We'll explore how JSONB enables dynamic property attributes and sophisticated search capabilities that adapt to evolving business needs.

Modern rental platforms must handle diverse property types with varying attributes. Traditional relational schemas become rigid when dealing with this variety, but Tacnode's JSONB support provides the flexibility to store, query, and analyze complex property data efficiently.

The Power of Semi-Structured Data

Why JSONB Matters for Rental Platforms

1. Dynamic Property Attributes Every property is unique. A beachfront villa has different amenities than a city apartment. JSONB allows each property to have its own set of attributes without requiring schema changes.

2. Evolving Business Requirements As your platform grows, new property types and amenities emerge. JSONB enables you to add new attributes instantly without database migrations.

3. Complex Search Capabilities Users want to search by combinations of amenities, host verification methods, and property features. JSONB queries make these complex searches efficient and intuitive.

4. Rich Analytics Business intelligence requires analyzing relationships between property attributes, guest preferences, and booking patterns. JSONB facilitates this analysis.

Data Model Enhancement

Let's enhance our existing listings table with powerful JSONB capabilities:

Adding Host Verification Data

Our platform needs to track various host verification methods to build trust. Let's convert the existing text-based verification data to structured JSONB:

-- Add new JSONB column for structured host verifications
ALTER TABLE listings
ADD COLUMN host_verifications_jsonb JSONB;
 
-- Convert existing text data to proper JSON format
UPDATE listings
SET host_verifications_jsonb = jsonb(
    REPLACE(REPLACE(host_verifications, '''', '"'), ' ', '')
);

This transformation converts text like 'phone, work_email' into proper JSON arrays like ["phone", "work_email"], enabling powerful querying capabilities.

Understanding Our Enhanced Data Model

With these enhancements, our platform can now handle:

  • Flexible Amenity Storage: Each property can have unique amenities stored as JSON arrays
  • Structured Host Verification: Verification methods are stored as queryable JSON data
  • Dynamic Attribute Addition: New property features can be added without schema changes
  • Complex Relationship Queries: We can analyze correlations between different property attributes

1. Dynamic Amenity Filtering

Scenario: A user wants to find properties with specific amenities like "WiFi" and "Kitchen"

SELECT 
    l.id,
    l.name,
    l.description,
    l.neighbourhood,
    l.neighbourhood_cleansed
FROM listings l
WHERE 
    l.amenities @> '["Wifi", "Kitchen"]'
    AND l.has_availability = TRUE;

The @> operator checks if the left JSON contains all elements from the right JSON, making it perfect for amenity filtering.

2. Quality-Based Search with Amenity Requirements

Scenario: Users seek high-rated properties (4.5+ stars) with specific amenities

SELECT 
    l.id,
    l.name,
    l.review_scores_rating,
    l.amenities
FROM listings l
WHERE 
    l.review_scores_rating > 4.5 
    AND l.amenities @> '["Elevator", "Pets allowed"]';

This query combines structured data (ratings) with semi-structured data (amenities) for comprehensive filtering.

3. Host Trust and Verification Filtering

Scenario: Security-conscious users prefer verified hosts with specific verification methods

SELECT 
    l.id,
    l.name,
    l.host_name,
    l.host_verifications_jsonb
FROM listings l
WHERE 
    l.host_verifications_jsonb @> '["work_email"]'
    AND l.host_identity_verified = TRUE;

This helps users find properties from hosts who have completed comprehensive verification processes.

Scenario: Users want properties meeting multiple complex criteria simultaneously

SELECT 
    l.id,
    l.name,
    l.price,
    l.room_type,
    l.neighbourhood,
    l.review_scores_rating
FROM listings l
JOIN calendar c ON l.id = c.listing_id
WHERE 
    -- Availability requirement
    c.available = TRUE
    AND c.date > '2024-01-01'
    
    -- Price range filter
    AND l.price BETWEEN 50::MONEY AND 150::MONEY
    
    -- Property type preference
    AND l.room_type = 'Entire home/apt'
    
    -- Essential amenity requirement
    AND l.amenities @> '["Air conditioning"]'
    
    -- Quality assurance
    AND l.review_scores_rating > 4.0
    
GROUP BY l.id, l.name, l.price, l.room_type, l.neighbourhood, l.review_scores_rating
ORDER BY l.review_scores_rating DESC;

Advanced Analytics with JSONB

1. Amenity Popularity Analysis

Business Question: Which amenities are most common across our platform?

SELECT 
    amenity,
    COUNT(*) as property_count,
    ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM listings), 2) as percentage
FROM (
    SELECT jsonb_array_elements_text(amenities) as amenity
    FROM listings
    WHERE amenities IS NOT NULL
) amenity_breakdown
GROUP BY amenity
ORDER BY property_count DESC
LIMIT 20;

2. Host Verification Impact Analysis

Business Question: Do more verified hosts receive better reviews?

SELECT 
    verification_count,
    AVG(review_scores_rating) as avg_rating,
    COUNT(*) as host_count
FROM (
    SELECT 
        host_id,
        jsonb_array_length(host_verifications_jsonb) as verification_count,
        AVG(review_scores_rating) as review_scores_rating
    FROM listings
    WHERE host_verifications_jsonb IS NOT NULL
    GROUP BY host_id, host_verifications_jsonb
) host_analysis
GROUP BY verification_count
ORDER BY verification_count;

3. Review Sentiment Analysis with Amenity Correlation

Business Question: Which amenities are mentioned most positively in reviews?

SELECT 
    amenity,
    COUNT(*) as mention_count,
    AVG(l.review_scores_rating) as avg_property_rating
FROM (
    SELECT 
        r.listing_id,
        jsonb_array_elements_text(l.amenities) as amenity
    FROM reviews r
    JOIN listings l ON r.listing_id = l.id
    WHERE r.comments ILIKE '%clean%' 
       OR r.comments ILIKE '%excellent%'
       OR r.comments ILIKE '%amazing%'
) positive_mentions
JOIN listings l ON positive_mentions.listing_id = l.id
GROUP BY amenity
HAVING COUNT(*) > 5
ORDER BY avg_property_rating DESC, mention_count DESC;

Performance Optimization for JSONB Queries

1. GIN Indexes for Fast JSONB Queries

-- Index for amenity searches
CREATE INDEX idx_listings_amenities_gin ON listings USING GIN(amenities);
 
-- Index for host verification searches  
CREATE INDEX idx_listings_host_verifications_gin ON listings USING GIN(host_verifications_jsonb);
 
-- Composite index for common query patterns
CREATE INDEX idx_listings_availability_amenities ON listings(has_availability) 
INCLUDE (amenities, review_scores_rating);

2. Optimized Query Patterns

Efficient Amenity Checking:

-- Fast: Use @> for containment
WHERE amenities @> '["Wifi"]'
 
-- Slower: Use ? for existence (when you need just one item)
WHERE amenities ? 'Wifi'

Complex Amenity Logic:

-- Multiple amenity requirements (AND logic)
WHERE amenities @> '["Wifi", "Kitchen", "Air conditioning"]'
 
-- Alternative amenity options (OR logic)  
WHERE amenities ?| array['Pool', 'Hot tub', 'Sauna']

Real-World Query Examples

Guest Preference Analysis

Find properties that satisfy family travelers:

SELECT 
    l.id,
    l.name,
    l.accommodates,
    l.review_scores_rating
FROM listings l
WHERE 
    -- Family-friendly amenities
    l.amenities @> '["Wifi", "Kitchen", "Washer"]'
    
    -- Child-safe features
    AND (l.amenities ? 'Crib' OR l.amenities ? 'High chair')
    
    -- Sufficient capacity
    AND l.accommodates >= 4
    
    -- Quality assurance
    AND l.review_scores_rating >= 4.0
    
ORDER BY l.review_scores_rating DESC;

Find properties optimized for business travelers:

SELECT 
    l.id,
    l.name,
    l.neighbourhood,
    l.price
FROM listings l
WHERE 
    -- Business amenities
    l.amenities @> '["Wifi", "Dedicated workspace"]'
    
    -- Professional host verification
    AND l.host_verifications_jsonb @> '["work_email", "phone"]'
    
    -- Instant booking for convenience
    AND l.instant_bookable = TRUE
    
    -- Central location
    AND l.neighbourhood_cleansed IN ('Central', 'Sheung Wan', 'Admiralty')
    
ORDER BY l.price ASC;

Find premium properties with luxury amenities:

SELECT 
    l.id,
    l.name,
    l.price,
    l.property_type
FROM listings l
WHERE 
    -- Luxury amenities
    l.amenities ?| array['Hot tub', 'Pool', 'Gym', 'Concierge']
    
    -- Premium property types
    AND l.property_type IN ('Villa', 'Townhouse', 'Condominium')
    
    -- High-end price range
    AND l.price >= '$200'
    
    -- Excellent ratings
    AND l.review_scores_rating >= 4.5
    
ORDER BY l.price DESC;

Integration with Application Layer

Using MyBatis Plus for JSONB Operations

Here's how you might implement these queries in a Java application:

@Mapper
public interface ListingMapper extends BaseMapper<Listing> {
    
    @Select("SELECT * FROM listings WHERE amenities @> #{amenities} AND has_availability = true")
    List<Listing> findByAmenities(@Param("amenities") String amenities);
    
    @Select("SELECT * FROM listings WHERE host_verifications_jsonb @> #{verifications} AND host_identity_verified = true")
    List<Listing> findByHostVerification(@Param("verifications") String verifications);
    
    // Complex multi-criteria search
    List<Listing> findByCriteria(@Param("criteria") SearchCriteria criteria);
}

Key Benefits Achieved

Through this tutorial, we've demonstrated how JSONB enables:

1. Unprecedented Flexibility

  • No schema changes needed for new property attributes
  • Each property can have unique characteristics
  • Easy adaptation to changing business requirements

2. Powerful Query Capabilities

  • Complex multi-attribute filtering
  • Efficient containment and existence checks
  • Rich analytical queries for business intelligence

3. Excellent Performance

  • GIN indexes make JSONB queries fast
  • Combines relational and document database benefits
  • Scales efficiently with large datasets

4. Enhanced User Experience

  • More precise search results
  • Dynamic filtering options
  • Personalized property recommendations

Conclusion

By integrating JSONB into our rental platform, we've created a system that combines the reliability of relational databases with the flexibility of document stores. This approach enables:

  • Dynamic Property Management: Easily accommodate diverse property types and evolving amenity lists
  • Sophisticated Search: Enable users to find exactly what they're looking for through multi-dimensional filtering
  • Business Intelligence: Extract insights from complex property and user behavior data
  • Future-Proof Architecture: Adapt quickly to new business requirements without database migrations

In our next tutorial, Housing Data Analytics, we'll build upon this foundation to create comprehensive analytics dashboards that provide actionable business insights from our rental platform data.