From Data to Product: Building Search, Visualization, and Real-Time Data Applications
In the previous articles, we scraped a large amount of product data from e-commerce platforms like Taobao, then cleaned, deduplicated, and stored it. However, data that merely sits in a database generates no real value.
This article takes you from being a “data provider” to building the “data application layer.” By building APIs, integrating a search engine, and designing front-end interfaces, you will make your data searchable, analyzable, and interactive.
1. What is a Data API? Make Your Data Dynamic
A data API (Application Programming Interface) allows applications to query a database via HTTP requests. It typically provides:
RESTful-style endpoint design
Input parameter control (such as filters, pagination, sorting)
JSON format output for easy use by front-end or third-party systems
Example:
GET /api/products?q=headphones&price_min=100&price_max=500
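To make the parameter handling concrete, here is a minimal sketch of how a client might assemble that request URL with Python's standard library. The helper name build_products_url and the default base URL are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_products_url(base_url="http://localhost:8000", **params):
    """Build a query URL for the /api/products endpoint (hypothetical helper)."""
    # Drop parameters that were not supplied so the URL stays short.
    supplied = {key: value for key, value in params.items() if value is not None}
    return f"{base_url}/api/products?{urlencode(supplied)}"


url = build_products_url(q="headphones", price_min=100, price_max=500)
print(url)
```

urlencode also takes care of percent-encoding, which matters once queries contain spaces or Chinese characters.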
2. Quickly Build Query APIs with FastAPI
FastAPI is a modern and high-performance Python web framework that’s perfect for building data-driven applications.
Install FastAPI and Uvicorn:
pip install fastapi uvicorn pymongo
Build a Product Query API (with MongoDB):
import re

from fastapi import FastAPI, Query
from pymongo import MongoClient

app = FastAPI()
client = MongoClient("mongodb://localhost:27017")
collection = client.taobao.products

@app.get("/api/products")
def search_products(
    q: str = Query(None),
    price_min: float = Query(0),
    price_max: float = Query(999999),
    limit: int = Query(20, ge=1, le=100),  # cap the page size to protect the database
):
    # Always filter by price range; add a title match only when q is given.
    query = {
        "price": {"$gte": price_min, "$lte": price_max}
    }
    if q:
        # Escape user input so it is matched literally, not as a regex pattern.
        query["title"] = {"$regex": re.escape(q), "$options": "i"}
    results = collection.find(query).limit(limit)
    return [
        {
            "id": str(item["_id"]),
            "title": item["title"],
            "price": item["price"]
        }
        for item in results
    ]
Run the app (assuming the code above is saved as main.py):
uvicorn main:app --reload
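The endpoint above only caps the number of results. Pagination, mentioned earlier as a typical API feature, could be layered on with a small helper like the one below. This is a sketch: the function name page_to_skip_limit and its defaults are assumptions, not part of FastAPI or PyMongo.

```python
def page_to_skip_limit(page=1, page_size=20, max_page_size=100):
    """Translate a 1-based page number into MongoDB-style skip/limit values."""
    page = max(1, page)  # treat page 0 or negative pages as page 1
    page_size = max(1, min(page_size, max_page_size))  # clamp to a safe range
    return (page - 1) * page_size, page_size


skip, limit = page_to_skip_limit(page=3, page_size=20)
# The values would then be applied to the cursor, for example:
# collection.find(query).skip(skip).limit(limit)
print(skip, limit)
```

Skip-based paging is simple but slows down on deep pages, so for very large collections a range query on a sorted field may be a better fit.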
3. Integrate Elasticsearch for Advanced Search: Chinese Support and Highlighting
If you have a large amount of data and require fuzzy search or Chinese text segmentation, Elasticsearch is a great backend engine.
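Install and Configure a Chinese Tokenizer (e.g., IK)
Install the IK analysis plugin, which provides the ik_max_word analyzer used here. Older Elasticsearch releases (before 5.0) accepted the following line in elasticsearch.yml; newer releases reject index-level settings in that file, so set the same analyzer in the index settings when you create the index:
index.analysis.analyzer.default.type: ik_max_word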
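The same analyzer choice can be expressed per index at creation time. The sketch below only builds the settings and mappings dictionaries; it assumes the IK analysis plugin is installed (it provides the ik_max_word and ik_smart analyzers), and the helper name build_index_definition and the field types are illustrative.

```python
def build_index_definition(index_analyzer="ik_max_word", search_analyzer="ik_smart"):
    """Return (settings, mappings) for a product index that uses the IK analyzers."""
    settings = {
        "analysis": {
            "analyzer": {
                # Per-index equivalent of index.analysis.analyzer.default.type
                "default": {"type": index_analyzer}
            }
        }
    }
    mappings = {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": index_analyzer,          # fine-grained tokens at index time
                "search_analyzer": search_analyzer,  # coarser tokens at query time
            },
            "price": {"type": "float"},
        }
    }
    return settings, mappings


settings, mappings = build_index_definition()
# With an 8.x Python client this could be applied as follows
# (assumption: the "taobao" index does not exist yet):
# es.indices.create(index="taobao", settings=settings, mappings=mappings)
```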
FastAPI + Elasticsearch Search Example:
from elasticsearch import Elasticsearch
from fastapi import FastAPI, Query

app = FastAPI()
es = Elasticsearch("http://localhost:9200")

@app.get("/api/search")
def search_product(q: str = Query(...)):
    body = {
        "query": {
            "match": {
                "title": {
                    "query": q,
                    "operator": "and"  # every query term must match
                }
            }
        },
        "highlight": {
            "fields": {
                "title": {}
            }
        }
    }
    # Note: 8.x clients deprecate body= in favor of query=/highlight= keyword arguments.
    res = es.search(index="taobao", body=body)
    return [
        {
            "id": hit["_id"],
            # Fall back to the stored title when no highlight fragment is returned.
            "title": hit.get("highlight", {}).get("title", [hit["_source"]["title"]])[0],
            "price": hit["_source"]["price"]
        }
        for hit in res["hits"]["hits"]
    ]
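Search and price filtering can be combined in a single request. The following sketch builds such a request body with a bool query and custom highlight tags; the function name build_search_body, the mark tags, and the default size are assumptions chosen for illustration.

```python
def build_search_body(q, price_min=None, price_max=None, size=20):
    """Build a search body: title match, optional price filter, highlighting."""
    price_range = {}
    if price_min is not None:
        price_range["gte"] = price_min
    if price_max is not None:
        price_range["lte"] = price_max

    bool_query = {"must": [{"match": {"title": {"query": q, "operator": "and"}}}]}
    if price_range:
        # Filter clauses restrict results without affecting relevance scores.
        bool_query["filter"] = [{"range": {"price": price_range}}]

    return {
        "size": size,
        "query": {"bool": bool_query},
        "highlight": {
            "pre_tags": ["<mark>"],    # wrap matched terms for the front end
            "post_tags": ["</mark>"],
            "fields": {"title": {}},
        },
    }


body = build_search_body("蓝牙耳机", price_min=100)
```

Keeping the body construction in a plain function makes it easy to unit test without a running Elasticsearch node.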
4. Build Visual Reports: Turn Data Into Charts
Chart libraries like Chart.js or ECharts can turn the output of a statistics API into visual charts.
Price Distribution (FastAPI + MongoDB):
@app.get("/api/price_stats")
def price_histogram():
    pipeline = [
        {"$bucket": {
            "groupBy": "$price",
            "boundaries": [0, 100, 300, 500, 1000, 5000],
            "default": "5000+",
            "output": {"count": {"$sum": 1}}
        }}
    ]
    data = list(collection.aggregate(pipeline))
    return data
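The $bucket stage returns one document per non-empty bucket, with _id set to the bucket's lower boundary, so empty price ranges are simply missing from the result. A chart needs a value for every label, so a small conversion step helps. The sketch below assumes the boundaries used in the pipeline above; the helper name buckets_to_chart and the sample counts are made up for illustration.

```python
def buckets_to_chart(buckets, boundaries=(0, 100, 300, 500, 1000, 5000), default_label="5000+"):
    """Convert $bucket output into parallel label/count lists for a bar chart."""
    counts = {bucket["_id"]: bucket["count"] for bucket in buckets}
    labels, data = [], []
    for lower, upper in zip(boundaries, boundaries[1:]):
        labels.append(f"{lower}-{upper}")
        data.append(counts.get(lower, 0))  # empty buckets are absent, so use 0
    labels.append(default_label)
    data.append(counts.get(default_label, 0))
    return {"labels": labels, "data": data}


# Sample aggregation output (illustrative values, not real data)
sample = [{"_id": 0, "count": 50}, {"_id": 300, "count": 90}, {"_id": "5000+", "count": 10}]
chart = buckets_to_chart(sample)
print(chart)
```

Returning labels and data in this shape lets the front end pass them straight into a Chart.js dataset.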
Frontend using Chart.js:
// ctx is the <canvas> element (or its 2D context); the counts below are sample values.
new Chart(ctx, {
  type: 'bar',
  data: {
    labels: ['0-100', '100-300', '300-500', '500-1000', '1000-5000', '5000+'],
    datasets: [{
      label: 'Product Count',
      data: [50, 120, 90, 40, 25, 10]
    }]
  }
});
5. Real-Time Querying and Frontend Integration Tips
1. Integrate with Frontend (Vue / React)
fetch('/api/products?q=bluetooth+headphones&price_min=100')
  .then(res => res.json())
  .then(data => console.log(data))
2. CORS Handling:
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # You can restrict to specific domains
    allow_credentials=False,  # Credentials require explicit origins, not "*"
    allow_methods=["*"],
    allow_headers=["*"],
)
3. API Rate Limiting:
Basic rate limiting can be implemented at the proxy layer with Nginx (the limit_req module) or in the application with Redis counters.
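To illustrate the counting logic behind such a limiter, here is a minimal in-memory fixed-window sketch. It is a single-process stand-in for what a Redis-based version would do with INCR and EXPIRE on a per-client key; the class name and defaults are assumptions.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client in each `window_seconds` window."""

    def __init__(self, limit=60, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counters = {}  # client_id -> (window_index, request_count)

    def allow(self, client_id):
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self._counters.get(client_id, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            return False
        self._counters[client_id] = (window, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=2, window_seconds=10)
print(limiter.allow("client-a"), limiter.allow("client-a"), limiter.allow("client-a"))
```

In an API, the client identifier would typically be an API key or IP address, and a rejected request would be answered with HTTP 429.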
6. Security and Performance Optimization
API Key Authentication: Protect sensitive data and prevent abuse
Redis Caching: Cache hot queries to reduce database load
Query Performance Monitoring: Mongo slow query logs / Elasticsearch timing metrics
Error Monitoring & Alerts: Use tools like Sentry or Prometheus
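The first two points can be sketched with the standard library alone. The example below checks an API key with a constant-time comparison and caches query results for a short time. The in-memory TTLCache is only a stand-in for Redis, and all names here (is_valid_api_key, TTLCache, get_or_compute) are hypothetical.

```python
import hmac
import time


def is_valid_api_key(presented_key, valid_keys):
    """Check a presented key against known keys using constant-time comparison."""
    presented = presented_key.encode()
    return any(hmac.compare_digest(presented, key.encode()) for key in valid_keys)


class TTLCache:
    """Cache query results for a short time (in-process stand-in for Redis)."""

    def __init__(self, ttl_seconds=30, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]  # cache hit: skip the database
        value = compute()  # cache miss or expired entry: run the real query
        self._store[key] = (self.clock() + self.ttl_seconds, value)
        return value


cache = TTLCache(ttl_seconds=30)
result = cache.get_or_compute("q=headphones", lambda: ["item-1", "item-2"])
```

A cache key built from the normalized query parameters ensures that identical hot queries share one cached result.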
7. Extended Use Cases
Build a product search site (e.g., internal Taobao mini search engine)
Create analytical dashboards using Superset / Grafana
Package your API as a third-party service for developers
Conclusion: Data Is Not Just Stored — It’s Activated
In this article, we turned “static data” into “dynamic applications,” making data a core capability of your website. This completes the full cycle: scraping → cleaning → storing → querying and presenting. While the previous articles focused on acquiring and organizing data, this one focuses on activating data to create real value.
Next, you can:
Expand data sources (more sites or different domains)
Add advanced features (recommendation systems, user behavior analytics)
Build a full data platform (API suites for developers)
Articles related to APIs:
Introduction to Taobao API: Basic Concepts and Application Scenarios
Taobao API: Authentication & Request Flow Explained with Code Examples
Using the Taobao API to Retrieve Product Information and Implement Keyword Search
How to Use the Taobao API to Build a Product Price Tracker and Alert System
Using the Taobao API to Build a Category-Based Product Recommendation System
Taobao Data Source Analysis and Technical Selection: API vs Web Scraping vs Hybrid Crawling
If you need the Taobao API, feel free to contact us: support@luckdata.com