From Data to Product: Building Search, Visualization, and Real-Time Data Applications

In the previous articles, we scraped a large amount of product data from e-commerce platforms such as Taobao, then cleaned, deduplicated, and stored it. But data that merely sits in a database without being used generates no real value.

This article takes you from being a “data provider” to building the “data application layer.” By building APIs, integrating a search engine, and designing front-end interfaces, you will make your data searchable, analyzable, and interactive.

1. What is a Data API? Make Your Data Dynamic

A data API (Application Programming Interface) allows applications to query a database via HTTP requests. It typically provides:

  • RESTful-style endpoint design

  • Input parameter control (such as filters, pagination, sorting)

  • JSON format output for easy use by front-end or third-party systems

Example:

GET /api/products?q=headphones&price_min=100&price_max=500
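To make the parameter structure of the example request concrete, here is a minimal sketch of how a client might assemble such a URL. The helper name build_products_url is hypothetical, and the sketch assumes the endpoint accepts exactly the three parameters shown in the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_products_url(q=None, price_min=None, price_max=None):
    """Build a request path for the /api/products endpoint (hypothetical helper)."""
    params = {"q": q, "price_min": price_min, "price_max": price_max}
    # Drop parameters the caller did not set so the server applies its defaults.
    params = {k: v for k, v in params.items() if v is not None}
    query = urlencode(params)
    return f"/api/products?{query}" if query else "/api/products"


url = build_products_url(q="headphones", price_min=100, price_max=500)
print(url)  # /api/products?q=headphones&price_min=100&price_max=500

# Parsing the query string back shows what the server receives: every value is text.
print(parse_qs(urlsplit(url).query))
```

Because query-string values always arrive as text, the server side has to convert and validate them, which is what the typed parameters in a web framework take care of.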

2. Quickly Build Query APIs with FastAPI

FastAPI is a modern, high-performance Python web framework that is well suited to building data-driven applications.

Install FastAPI, Uvicorn, and PyMongo:

pip install fastapi uvicorn pymongo

Build a Product Query API (with MongoDB):

import re
from typing import Optional

from fastapi import FastAPI, Query
from pymongo import MongoClient

app = FastAPI()

# Assumes a local MongoDB with a "taobao" database and a "products" collection
client = MongoClient("mongodb://localhost:27017")
collection = client.taobao.products


@app.get("/api/products")
def search_products(
    q: Optional[str] = Query(None),
    price_min: float = Query(0),
    price_max: float = Query(999999),
    limit: int = Query(20, ge=1, le=100),  # cap the page size
):
    # Always filter by price range
    query = {"price": {"$gte": price_min, "$lte": price_max}}
    if q:
        # Escape the keyword so user input is matched literally, not as a regex
        query["title"] = {"$regex": re.escape(q), "$options": "i"}

    results = collection.find(query).limit(limit)
    return [
        {
            "id": str(item["_id"]),
            "title": item["title"],
            "price": item["price"],
        }
        for item in results
    ]

Run the app (assuming the code above is saved as main.py):

uvicorn main:app --reload
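Section 1 lists pagination and sorting as typical API parameters, which the endpoint above does not yet cover. The following is a minimal sketch of how the filter and paging arithmetic could be separated into a pure function so it can be tested without a database. The function name build_product_query and the page and page_size parameters are assumptions made for illustration; they are not part of the endpoint above.

```python
import re
from typing import Any, Dict, Optional, Tuple


def build_product_query(
    q: Optional[str] = None,
    price_min: float = 0,
    price_max: float = 999999,
    page: int = 1,
    page_size: int = 20,
) -> Tuple[Dict[str, Any], int, int]:
    """Return (filter, skip, limit) for a paginated product search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    mongo_filter: Dict[str, Any] = {"price": {"$gte": price_min, "$lte": price_max}}
    if q:
        # Match the keyword literally and case-insensitively.
        mongo_filter["title"] = {"$regex": re.escape(q), "$options": "i"}
    skip = (page - 1) * page_size
    return mongo_filter, skip, page_size


mongo_filter, skip, limit = build_product_query(q="headphones", price_min=100, page=3)
print(mongo_filter)
print(skip, limit)  # 40 20

# With a PyMongo collection this would be applied roughly as:
#   collection.find(mongo_filter).sort("price", 1).skip(skip).limit(limit)
```

Skip-based paging is simple but becomes slow for very deep pages, since the database still walks past the skipped documents.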

3. Integrate Elasticsearch for Advanced Search: Chinese Support and Highlighting

If you have a large amount of data and require fuzzy search or Chinese text segmentation, Elasticsearch is a great backend engine.

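Install and Configure a Chinese Tokenizer (e.g., the IK Analysis plugin)

The ik_max_word analyzer comes from the IK Analysis plugin, which must be installed on every Elasticsearch node. The line below, placed in elasticsearch.yml, only works on Elasticsearch versions before 5.0:

index.analysis.analyzer.default.type: ik_max_word

Elasticsearch 5.0 and later no longer accept index-level settings in elasticsearch.yml. On current versions, set the analyzer in the index settings or on the title field mapping when you create the index.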
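On recent Elasticsearch versions the analyzer is usually configured per index. The following is a minimal sketch of a mapping for the taobao index queried later in this section. The constant name TAOBAO_INDEX_BODY is hypothetical, the field types are assumptions about the stored documents, and the sketch assumes the IK Analysis plugin is installed.

```python
# Hypothetical index definition for the "taobao" index used in this article.
# Assumes the IK Analysis plugin is installed on the cluster.
TAOBAO_INDEX_BODY = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                # Fine-grained segmentation at index time
                "analyzer": "ik_max_word",
                # Coarser segmentation for search queries
                "search_analyzer": "ik_smart",
            },
            # Assumed numeric price field
            "price": {"type": "float"},
        }
    }
}

# With a recent official Python client this could be applied once, before
# indexing any documents, for example:
#   es.indices.create(index="taobao", mappings=TAOBAO_INDEX_BODY["mappings"])
```

Analyzers are applied when documents are indexed, so an existing index has to be reindexed for a changed analyzer to take effect.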

FastAPI + Elasticsearch Search Example:

from elasticsearch import Elasticsearch
from fastapi import FastAPI, Query

app = FastAPI()
es = Elasticsearch("http://localhost:9200")


@app.get("/api/search")
def search_product(q: str = Query(...)):
    body = {
        "query": {
            "match": {
                # "and" requires every query term to appear in the title
                "title": {"query": q, "operator": "and"}
            }
        },
        # Ask Elasticsearch to return title fragments with the matches marked up
        "highlight": {"fields": {"title": {}}},
    }
    res = es.search(index="taobao", body=body)
    return [
        {
            "id": hit["_id"],
            # Fall back to the stored title if no highlight was returned
            "title": (hit.get("highlight", {}).get("title")
                      or [hit["_source"].get("title")])[0],
            "price": hit["_source"].get("price"),
        }
        for hit in res["hits"]["hits"]
    ]
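The response handling can also be pulled into a small pure function, which makes it testable without a running cluster. The sketch below is one way to do that; the function name format_hits is hypothetical, and the sample response is hand-written to mimic the shape of a search response (by default Elasticsearch wraps highlighted terms in em tags). It is not real data.

```python
from typing import Any, Dict, List


def format_hits(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a search response into id/title/price records."""
    records = []
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        fragments = hit.get("highlight", {}).get("title")
        records.append(
            {
                "id": hit["_id"],
                # Prefer the highlighted fragment; otherwise use the stored title.
                "title": fragments[0] if fragments else source.get("title"),
                "price": source.get("price"),
            }
        )
    return records


# Hand-written sample shaped like an Elasticsearch search response (not real data).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"title": "Bluetooth headphones", "price": 199.0},
                "highlight": {"title": ["Bluetooth <em>headphones</em>"]},
            },
            {"_id": "2", "_source": {"title": "Wired headphones", "price": 59.0}},
        ]
    }
}

for record in format_hits(sample_response):
    print(record)
```

Since the highlighted title contains markup, the front end should render it as HTML only after making sure the rest of the title is escaped.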

4. Build Visual Reports: Turn Data Into Charts

With a chart library such as Chart.js or ECharts, the front end can render charts from the aggregated data that a statistics API returns.

Price Distribution (FastAPI + MongoDB):

@app.get("/api/price_stats")
def price_histogram():
    pipeline = [
        {
            "$bucket": {
                "groupBy": "$price",
                # Each bucket covers [lower, upper); its _id is the lower bound
                "boundaries": [0, 100, 300, 500, 1000, 5000],
                # Catches prices outside the boundaries (and missing prices)
                "default": "5000+",
                "output": {"count": {"$sum": 1}},
            }
        }
    ]
    return list(collection.aggregate(pipeline))

Frontend using Chart.js:

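// ctx is the <canvas> element (or its 2D context) that the chart is drawn on
new Chart(ctx, {
  type: 'bar',
  data: {
    labels: ['0-100', '100-300', '300-500', '500-1000', '1000-5000', '5000+'],
    datasets: [{
      label: 'Product Count',
      // Sample values; in practice, use the counts returned by /api/price_stats
      data: [50, 120, 90, 40, 25, 10]
    }]
  }
});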
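The price statistics endpoint returns one document per non-empty bucket, keyed by the bucket's lower bound, while the chart expects parallel lists of labels and counts. The following is a minimal sketch of that reshaping step. The function name to_chart_series is hypothetical, and the sample input is hand-written to show the expected shape, not real query output.

```python
from typing import Any, Dict, List

BOUNDARIES = [0, 100, 300, 500, 1000, 5000]
DEFAULT_BUCKET = "5000+"


def to_chart_series(buckets: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert $bucket output into parallel label and count lists."""
    counts = {bucket["_id"]: bucket["count"] for bucket in buckets}
    labels, data = [], []
    # $bucket omits empty buckets, so walk the boundaries and fill gaps with 0.
    for lower, upper in zip(BOUNDARIES, BOUNDARIES[1:]):
        labels.append(f"{lower}-{upper}")
        data.append(counts.get(lower, 0))
    labels.append(DEFAULT_BUCKET)
    data.append(counts.get(DEFAULT_BUCKET, 0))
    return {"labels": labels, "data": data}


# Hand-written sample of what the aggregation might return (the 500 bucket is empty).
sample = [
    {"_id": 0, "count": 50},
    {"_id": 100, "count": 120},
    {"_id": 300, "count": 90},
    {"_id": 1000, "count": 25},
    {"_id": "5000+", "count": 10},
]
print(to_chart_series(sample))
```

Doing this on the server keeps the labels and counts aligned even when a price range has no products, so the front end can pass both lists straight to the chart.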

5. Real-Time Querying and Frontend Integration Tips

1. Integrate with Frontend (Vue / React):

fetch('/api/products?q=bluetooth+headphones&price_min=100')
  .then(res => res.json())
  .then(data => console.log(data))

2. CORS Handling:

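from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Restrict to specific domains in production
    # Wildcard origins cannot be combined with credentials (cookies, auth headers);
    # list explicit origins above if you need allow_credentials=True
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)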

3. API Rate Limiting:

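You can implement basic rate limiting at the proxy layer with Nginx (for example, its limit_req module) or in the application with a per-client counter stored in Redis.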
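As an illustration of the counter-based approach, here is a minimal sketch of a fixed-window rate limiter. It keeps its counters in an in-memory dictionary as a stand-in for Redis, so it only works within a single process; the class name FixedWindowLimiter and its parameters are assumptions made for this example.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most max_requests per client in each window of window_seconds."""

    def __init__(self, max_requests: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps (client_id, window_index) -> request count. In Redis this would be
        # one key per client and window, bumped with INCR and given an EXPIRE.
        self.counters: Dict[Tuple[str, int], int] = {}

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        # Drop counters from earlier windows (Redis would expire them for us).
        self.counters = {k: v for k, v in self.counters.items() if k[1] == window}
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.max_requests


# Usage with a fake clock so the behaviour is deterministic.
now = [1000.0]
limiter = FixedWindowLimiter(max_requests=2, window_seconds=60, clock=lambda: now[0])
print([limiter.allow("client-a") for _ in range(3)])  # [True, True, False]
now[0] += 60  # move into the next window
print(limiter.allow("client-a"))  # True
```

A fixed window is the simplest scheme, but it allows short bursts around window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.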

6. Security and Performance Optimization

  • API Key Authentication: Protect sensitive data and prevent abuse

  • Redis Caching: Cache hot queries to reduce database load

  • Query Performance Monitoring: Mongo slow query logs / Elasticsearch timing metrics

  • Error Monitoring & Alerts: Use tools like Sentry or Prometheus
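To illustrate the caching item in the list above, here is a minimal sketch of a time-to-live cache for hot queries. It stores entries in process memory as a stand-in for Redis; the class name TTLCache, the slow_query function, and the 30-second lifetime are all assumptions chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps key -> (expiry time, cached value)
        self.store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive query
        value = compute()
        self.store[key] = (now + self.ttl_seconds, value)
        return value


calls = []


def slow_query() -> list:
    # Stand-in for a MongoDB or Elasticsearch query.
    calls.append(1)
    return [{"title": "Bluetooth headphones", "price": 199.0}]


now = [0.0]
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
key = ("headphones", 100, 500)  # the query parameters form the cache key
cache.get_or_compute(key, slow_query)
cache.get_or_compute(key, slow_query)
print(len(calls))  # 1: the second call was served from the cache
now[0] = 31.0
cache.get_or_compute(key, slow_query)
print(len(calls))  # 2: the entry expired, so the query ran again
```

A short lifetime keeps results reasonably fresh while still absorbing repeated identical searches; with Redis the same idea maps to storing the serialized result under a key with an expiry.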

7. Extended Use Cases

  • Build a product search site (e.g., an internal mini search engine for Taobao products)

  • Create analytical dashboards using Superset / Grafana

  • Package your API as a third-party service for developers

Conclusion: Data Is Not Just Stored — It’s Activated

In this article, we turned “static data” into “dynamic applications,” making data a core capability of your website. This completes the full cycle: scraping → cleaning → storing → querying and presenting. While the previous articles focused on acquiring and organizing data, this one focuses on activating data to create real value.

Next, you can:

  • Expand data sources (more sites or different domains)

  • Add advanced features (recommendation systems, user behavior analytics)

  • Build a full data platform (API suites for developers)

If you need the Taobao API, feel free to contact us: support@luckdata.com