WebScrapping

Open source Python project. 42 stars, 21 forks, 0 issues, 2 watchers. Last commit: 8 years ago.

About WebScrapping

WebScrapping is a Python-based utility for data mining, analysis, and visualization of data from web sources. It extracts structured data (product categories, activities, and purchase counts) by first fetching a target landing page and then iterating through all subsequent paginated pages. Once the raw data is collected, the script applies statistical techniques to it and presents the results as charts. The tool runs on Python 2 or 3 and relies on requests for HTTP requests, BeautifulSoup for parsing HTML, Pandas for data manipulation, Matplotlib for charting, and Python's built-in re module (regular expressions) for pattern matching. To use it, install these dependencies and run the provided Python script from the terminal. This solution is ideal for researche

Platforms

Web Self-hosted

Languages

Python

Links

Webscrapping

Web scraping with Python, plus data mining, data analysis, and data visualization of the collected data.

Getting Started

These instructions will give you an idea of the project and get it up and running on your local machine for development and execution. See deployment for notes on how to deploy the project on a live system.

The Python script fetches all the individual categories from the website (http://www.xyz.com). It starts with the first page and then iterates through every subsequent page of the site, collecting the activities, categories, and bought counts. I then used statistical techniques to analyze the data mathematically and present it as visualizations.
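The flow described above can be sketched roughly as follows. This is not the project's actual script, only a minimal illustration under several assumptions: the CSS classes (`item`, `category`, `activity`, `bought`), the `?page=N` query parameter, the function names, and the rule of stopping at the first empty page are all hypothetical and would need to be adapted to the real site.

```python
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = "http://www.xyz.com"  # placeholder site named in the README


def parse_page(html):
    """Extract one row per listing from a page of HTML.

    The CSS classes used here are hypothetical; adapt them to the
    markup of the real page.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("div.item"):
        bought_text = item.select_one(".bought").get_text()
        match = re.search(r"\d[\d,]*", bought_text)  # e.g. "1,250 bought"
        rows.append({
            "category": item.select_one(".category").get_text(strip=True),
            "activity": item.select_one(".activity").get_text(strip=True),
            "bought": int(match.group().replace(",", "")) if match else 0,
        })
    return rows


def scrape_all(fetch, max_pages=100):
    """Walk pages 1, 2, 3, ... and stop at the first page with no rows."""
    rows = []
    for page in range(1, max_pages + 1):
        page_rows = parse_page(fetch(page))
        if not page_rows:
            break
        rows.extend(page_rows)
    return rows


def fetch_page(page):
    """Download one page; the "page" query parameter is an assumption."""
    response = requests.get(BASE_URL, params={"page": page}, timeout=10)
    response.raise_for_status()
    return response.text


def summarize(rows):
    """Per category: total bought, mean bought, and number of activities."""
    frame = pd.DataFrame(rows, columns=["category", "activity", "bought"])
    return frame.groupby("category")["bought"].agg(["sum", "mean", "count"])
```

With a real site, `summarize(scrape_all(fetch_page))` would return one row per category, and calling `.plot(kind="bar")` on its `sum` column would draw a bar chart through Matplotlib. Passing the fetch function in as an argument keeps the parsing and pagination logic testable without network access.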

Prerequisites

What you need to install the software, and how to install it:

Python 2 or Python 3

Installing

Step-by-step instructions to get a development environment running:

step 1. Install Python 2 or Python 3
step 2. pip install bs4
step 3. pip install requests
step 4. No install needed for re; it is part of the Python standard library
step 5. pip install pandas
step 6. pip install matplotlib
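
Once the packages are installed, a short check along these lines can confirm that the interpreter can find every module the script needs. It is only a sketch: the helper name `missing_modules` is made up for illustration, and the list uses import names (`bs4` rather than the package name). The `re` module ships with Python, so it should always be found.

```python
import importlib.util

# Import names of the modules the README lists; "re" is built into Python.
REQUIRED_MODULES = ["bs4", "requests", "re", "pandas", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules are available.")
```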

Running the Script

Open a terminal or console (Windows, Linux, or macOS) and run the script with Python, where filename is the name of the script file:

    python filename

  • Documents - Complete project details (ppt)

Authors