I'm taking part in the Mid-Autumn Festival creative writing contest. For details, see: Mid-Autumn Festival Creative Submission Contest
The Mid-Autumn Festival is next week, so happy Mid-Autumn Festival to everyone in advance.
Today we'll write a JavaScript program that crawls the first 100 pages of mooncake listings on JD.com (Jingdong), to see how much mooncakes sell for each day as the Mid-Autumn Festival approaches.
The data is for reference only; accuracy is not guaranteed.
Thanks for your likes. Staying up late to write this wasn't easy.
Technologies used
- Tampermonkey (the "oil monkey" userscript manager), a Chrome browser extension
- Native JavaScript DOM operations
- fetch requests
- async/await for delays
- Express for the data storage API and the statistics API
- Node.js for reading JSON files
- Deployment to Tencent Cloud Serverless
Statistics display
Note: the data for 2021-9-7 is mock data, added so that the thanLastDay field for 2021-9-8 can be calculated.
Field description:
{
    "date": "2021-9-8", // date
    "total": "89026 hundred million", // total sales up to this date
    "thanLastDay": "7687 ten thousand" // increase in sales compared with the previous day
}
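To make the two derived fields concrete, here is a minimal sketch of the arithmetic behind total and thanLastDay. It assumes, as the server code later in this post does, that daily totals are kept in units of ten thousand. The helper name describeDay and the sample figures are made up for illustration.

```javascript
// Build one statistics record from two daily totals.
// Both totals are assumed to be in units of ten thousand.
// describeDay is a hypothetical helper, not part of the original project.
function describeDay(date, totalWan, previousTotalWan) {
  return {
    date,
    // 10,000 x ten thousand = one hundred million
    total: `${Math.floor(totalWan / 10000)} hundred million`,
    thanLastDay:
      previousTotalWan === undefined
        ? 'no data yet'
        : `${Math.floor(totalWan - previousTotalWan)} ten thousand`,
  };
}

// Illustrative numbers only, not real sales data
console.log(describeDay('2021-9-8', 125000, 118500));
// → { date: '2021-9-8', total: '12 hundred million', thanLastDay: '6500 ten thousand' }
```

Without a previous day's total there is nothing to subtract, which is why the first day in the data set needs mock data.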
Now let's get to work
1. Install the Tampermonkey extension
If you can access the Chrome Web Store, install it from the official link below:
chrome.google.com/webstore/de…
If you can't, search Baidu for Tampermonkey. Many websites offer ways to install it locally; I won't provide them here, to avoid infringement.
2. Write a script to crawl JD mooncake data
After installation, the Tampermonkey icon appears in the upper right corner of the browser, as shown in the picture
First open the JD home page, search for mooncakes (月饼), and go to the product list
Then click the dashboard to open the script list page, where you can enable or disable a script
Then click + to create a new script
I wrote a simple script here that you can paste in
// ==UserScript==
// @name         jd mooncakes
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  Crawls product data from 100 pages
// @author       Ovos
// @match        https://search.jd.com/**
// @icon         https://www.google.com/s2/favicons?domain=jd.com
// @grant        none
// ==/UserScript==
(function() {
    'use strict';

    // Convert a review-count label such as "2万+" or "500+" into a number
    // (万 means ten thousand)
    function getNumber(str) {
        if (str.includes('万+')) {
            return parseInt(str, 10) * 10000
        }
        return parseInt(str, 10)
    }

    // Wait for the given number of seconds
    function sleep(time) {
        return new Promise((resolve) => {
            setTimeout(resolve, time * 1000)
        })
    }

    async function main() {
        // Wait for the first page to load
        await sleep(3)
        for (let i = 0; i < 100; i++) {
            // Scroll to the bottom
            window.scrollTo(0, 18000)
            // Wait for the lazily loaded items at the bottom
            await sleep(3)
            // Scroll to the bottom again, in case some data has not loaded
            window.scrollTo(0, 18000)
            // Wait for the bottom data to load
            await sleep(2)
            // Compute this page's total sales and send it to the API
            await getTotal()
            // Go to the next page
            document.querySelector('#J_bottomPage > span.p-num > a.pn-next').click()
            // Wait for the next page to load
            await sleep(3)
        }
    }

    async function getTotal() {
        let pageTotal = 0
        document.querySelectorAll('#J_goodsList > ul > li').forEach(el => {
            const priceEl = el.querySelector('.p-price i')
            const commitEl = el.querySelector('.p-commit a')
            // Skip list entries that lack a price or a review count
            if (!priceEl || !commitEl) return
            // Product price
            const price = parseFloat(priceEl.innerText)
            // Number of reviews, used here as a stand-in for units sold
            const saleNum = getNumber(commitEl.innerText)
            if (Number.isNaN(price) || Number.isNaN(saleNum)) return
            console.log(price, saleNum)
            // Add this product's sales to the page total
            pageTotal += price * saleNum
        })
        // Send this page's total sales to the storage API
        const res = await fetch('http://localhost:9000/save', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({pageTotal})
        })
        const json = await res.json()
        console.log('Success:', json);
    }

    // Run the program
    main()
})();
- First comes a for loop fixed at 100 iterations, because JD's product list has 100 pages
- Next, scroll to the bottom of the page, because part of the list is loaded asynchronously via AJAX
- The sleep function waits for a fixed time, using async/await syntax
- Then wait 3 seconds and scroll to the bottom again, in case some data has not loaded
- Then use document.querySelectorAll to get all the items on the page
- Then use document.querySelector to get the price and review count of each item
- Calculate the page's total sales, pageTotal
- Then use fetch to call the Node.js storage API, which stores the sales calculated for the current page for later analysis
- Finally, go to JD's home page, search for mooncakes, and open the search results page. Once the script has paged through to the last page (page 100), data collection is complete. You can do something else in the meantime, because it takes a long time.
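The per-page calculation in the steps above can be sketched without the DOM, so that it can be tried in Node.js. This is only an illustration: parseReviewCount and computePageTotal are hypothetical helper names, and the sample items are made up. It assumes the review labels look like "2万+" or "500+", as in the userscript.

```javascript
// Convert a review-count label such as "2万+" or "500+" into a number.
// 万 means ten thousand. Unreadable labels count as 0.
function parseReviewCount(text) {
  const value = parseFloat(text);
  if (Number.isNaN(value)) return 0;
  return text.includes('万') ? Math.round(value * 10000) : value;
}

// Sum price x review count over all items on one page.
// Each item holds the raw text scraped from the page.
function computePageTotal(items) {
  return items.reduce((sum, item) => {
    const price = parseFloat(item.price);
    if (Number.isNaN(price)) return sum; // skip items without a readable price
    return sum + price * parseReviewCount(item.reviews);
  }, 0);
}

// Made-up sample items for illustration
const sampleItems = [
  { price: '59.50', reviews: '2万+' },
  { price: '128.00', reviews: '500+' },
];
console.log(computePageTotal(sampleItems)); // → 1254000
```

Keeping the parsing separate from the DOM queries makes it easy to check the arithmetic before letting the script run through all 100 pages.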
Below, let's take a look at the script in action
[Juejin doesn't support video uploads, sob...]
3. Build the storage and analysis API with Express
The code is as follows
const express = require('express')
const cors = require('cors')
const path = require('path')
const fs = require('fs')

const app = express()
app.use(express.json())
app.use(express.urlencoded({ extended: true }))
app.use(cors())

// Format a Date as the "YYYY-M-D" key used for the data file names
const formatDate = (d) => `${d.getFullYear()}-${d.getMonth() + 1}-${d.getDate()}`

// Get statistics
app.get('/get', (req, res) => {
    // Total sales for the given date, in units of ten thousand
    const getTotal = (date) => {
        const filePath = path.join(__dirname, 'data', `${date}.json`)
        if (!fs.existsSync(filePath)) {
            return 0
        }
        const data = JSON.parse(fs.readFileSync(filePath, 'utf8'))
        // Use the cached total if it has already been computed
        if (typeof data.total === 'number') {
            return data.total
        }
        // Start from 0 so that every page total, including the first, is converted
        const total = data.data.reduce((sum, pageTotal) => {
            return sum + Math.floor(pageTotal) / 10000
        }, 0)
        // Cache the total so it is not recomputed next time
        data.total = total // in units of ten thousand
        fs.writeFileSync(filePath, JSON.stringify(data))
        return total
    }

    // Date key of the day before the given date
    const getLastDay = (dateTime) => {
        const d = new Date(dateTime)
        d.setDate(d.getDate() - 1)
        return formatDate(d)
    }

    // All dates that have data
    const dateList = fs.readdirSync(path.join(__dirname, 'data'))
    // Build the response, including the increase over the previous day
    const data = dateList.map(fileName => {
        const date = fileName.replace('.json', '')
        const total = getTotal(date)
        const lastTotal = getTotal(getLastDay(date))
        return {
            date,
            total: Math.floor(total / 10000) + ' hundred million',
            thanLastDay: lastTotal !== 0 ? Math.floor(total - lastTotal) + ' ten thousand' : 'No data yet'
        }
    })
    // Sort by date, newest first
    res.send(data.sort((a, b) => new Date(b.date) - new Date(a.date)))
})

// Store the sales of each of the day's 100 pages
app.post('/save', (req, res) => {
    // Today's date key
    const today = formatDate(new Date())
    // File path
    const filePath = path.join(__dirname, 'data', `${today}.json`)
    // Create the storage file if it does not exist yet
    if (!fs.existsSync(filePath)) {
        fs.writeFileSync(filePath, JSON.stringify({ data: [] }))
    }
    // Read the file
    const data = JSON.parse(fs.readFileSync(filePath, 'utf8'))
    // Append the total sales of all products on the current page
    data.data.push(req.body.pageTotal)
    // The data changed, so drop any cached total
    delete data.total
    // Write back to the JSON file
    fs.writeFileSync(filePath, JSON.stringify(data))
    // Return the data
    res.send(data)
})

// Port 9000 matches the URL used by the userscript
app.listen(9000, function () {
    console.log('Service started: http://localhost:9000')
})
There are two main API endpoints here
GET - http://localhost:9000/get
Returns the statistics. The structure is as follows
[
    {
        "date": "2021-9-8", // date
        "total": "88615 hundred million", // total sales
        "thanLastDay": "4338 ten thousand" // increase in sales over the previous day
    },
    {
        "date": "2021-9-7",
        "total": "88615 hundred million",
        "thanLastDay": "No data yet"
    }
]
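Two details of this response are easy to get wrong: finding the previous day's key across a month boundary, and ordering keys that are not zero-padded ("2021-9-10" would sort before "2021-9-8" as plain text). The sketch below shows one way to handle both. The helper names are hypothetical, and the keys are parsed by hand rather than passed to the Date constructor as the server code does.

```javascript
// Parse a "YYYY-M-D" key (no zero padding) into a local Date.
function parseDateKey(key) {
  const [year, month, day] = key.split('-').map(Number);
  return new Date(year, month - 1, day);
}

// Format a Date back into the same "YYYY-M-D" key.
function formatDateKey(date) {
  return `${date.getFullYear()}-${date.getMonth() + 1}-${date.getDate()}`;
}

// Key of the day before the given key; setDate handles month and year rollover.
function previousDateKey(key) {
  const date = parseDateKey(key);
  date.setDate(date.getDate() - 1);
  return formatDateKey(date);
}

// Return a copy of the records sorted by date, newest first.
function sortNewestFirst(records) {
  return [...records].sort((a, b) => parseDateKey(b.date) - parseDateKey(a.date));
}

console.log(previousDateKey('2021-9-1')); // → 2021-8-31
console.log(
  sortNewestFirst([{ date: '2021-9-8' }, { date: '2021-9-10' }, { date: '2021-9-7' }])
    .map((record) => record.date)
); // → [ '2021-9-10', '2021-9-8', '2021-9-7' ]
```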
POST - http://localhost:9000/save
Stores each page's sales for the current day. The data is written to the file data/<current date>.json
{"data":[885434000,692030500,234544840,601344769.5,172129350,182674704.6,133972752.6,205753590,80450922,77355786.19999999,151456533,110421752,92058113.7,303276508,174283087.7,271311291.3,63696476.8,141753035.7,338476616.4,270641094,86462147,27128625,36139929,45965566.900000006,72166439.10000001,192549501,10540359.4,69775609.4,22760644,18128574.6,4775594.2,11293833.100000001,69100044.5,18697712.7,5837212.3,10642395.6,12401900.700000003,7687292.750000001,5542854.199999999,6173778.3,15844723.86,312611521.7,322072634.2,57924578,365159510,31830203.6,37628351.7,11473636.700000001,25383806.799999997,30270479.9,82777935.4,71801949,17886438.4,76748973.5,29326328.4,11953917.4,5390966.8,25723722.5,9660846,33003014.7,35118788.5,11297238.8,7611442.84,19172848.34,6824560,18840682.700000003,13633325.1,61348156.3,32949962.4,28584186.1,25574649.3,40607000.4,27084038.700000003,34280644.35,13503164.6,7837763.899999999,27559845.42,12587807.8,11210537.2,10225227.48,14791757.24,14573441.399999999,5919098.6,7467049.7,26552201.6,6259477.100000001,7240613.68,5715078,5421074.500000001,6174596.500000001,12098670,3628428.2,5442460.100000001,6925294.8,16266156.259999998,7562844.060000001,16977870.1,6701592.3999999985,6060801,6081381.699999999]}
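The read, append, write cycle behind /save can be tried on its own, without Express. The following is a minimal sketch under a few assumptions: appendPageTotal is a hypothetical helper, and the demo writes to a temporary directory instead of the project's data folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Append one page total to the JSON file for the given date key,
// creating the record if the file does not exist yet.
function appendPageTotal(dataDir, dateKey, pageTotal) {
  const filePath = path.join(dataDir, `${dateKey}.json`);
  const record = fs.existsSync(filePath)
    ? JSON.parse(fs.readFileSync(filePath, 'utf8'))
    : { data: [] };
  record.data.push(pageTotal);
  fs.writeFileSync(filePath, JSON.stringify(record));
  return record;
}

// Demo in a throwaway directory, with made-up page totals
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mooncake-'));
appendPageTotal(demoDir, '2021-9-8', 1200.5);
console.log(appendPageTotal(demoDir, '2021-9-8', 300)); // → { data: [ 1200.5, 300 ] }
```

Because the whole file is read and rewritten on every call, this approach suits a single crawler sending one request per page, not concurrent writers.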
- The project mainly uses fs.writeFileSync and fs.readFileSync to read and write the JSON files
- The cors() middleware enables cross-origin requests
4. Deploy to Tencent Cloud Serverless
Finally, I'll deploy this Express service to the cloud so that everyone can see it
- Make sure the Express project listens on port 9000 (Tencent Cloud requires port 9000)
- Create the scf_bootstrap startup file
#!/bin/sh
npm run start
- Log in to the Tencent Cloud Serverless console and click Function Service on the left
- Click the New button
- Choose "Custom creation"
- Function type: choose "Web function".
- Function name: enter your own function name.
- Region: choose the region to deploy your function to; the default is Guangzhou.
- Runtime environment: choose "Nodejs 12.16".
- Deployment method: choose "Code deployment" and upload your local project.
- Submission method: choose "Local folder upload".
- Function code: select the local folder that contains the function code.
- Click Finish
See the following video for the detailed steps
[Juejin doesn't support video uploads, sob...]
After a successful deployment, Tencent Cloud provides an address that can be used to test the service
service-ehtglv6w-1258235229.gz.apigw.tencentcs.com/release/get
Note:
- Tencent Cloud Serverless has a free usage quota; see its documentation for details
- Serverless does not allow writing to files, so the /save endpoint will report an error. The solution is to mount a CFS file system. I was too lazy to do that, and it costs money.
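If you do mount a CFS file system, the server needs to read and write its JSON files there instead of in the project folder. One possible way is to make the data directory configurable. This is only a sketch: the DATA_DIR environment variable, the resolveDataDir helper, and the example mount path are all assumptions, not part of the original project or of Tencent Cloud's API.

```javascript
const path = require('path');

// Pick the directory that holds the per-day JSON files.
// DATA_DIR is a hypothetical environment variable that would point at the
// writable mount; without it, fall back to a "data" folder under baseDir.
function resolveDataDir(env = process.env, baseDir = process.cwd()) {
  return env.DATA_DIR ? path.resolve(env.DATA_DIR) : path.join(baseDir, 'data');
}

// Example: with the variable set, files would go to the mounted directory
console.log(resolveDataDir({ DATA_DIR: '/mnt/cfs/mooncake-data' }, '/app'));
// Example: without it, the local data folder is used
console.log(resolveDataDir({}, '/app'));
```

The two endpoints would then build their file paths from this directory rather than from path.join(__dirname, 'data').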
GitHub source code:
5. Summary
With that, the whole feature is complete: every day the Tampermonkey script crawls 100 pages of data, Express stores it in a JSON file, and the API calculates the difference from the previous day. That gives us the daily mooncake sales figures we set out to calculate.