title: PhysicsWallah M3u8 Parser
emoji: 💻🐳
colorFrom: gray
colorTo: green
sdk: docker
pinned: false
suggested_storage: small
license: mit

PhysicsWallah M3u8 Parser

This is a Python script that parses M3u8 files. It uses the argparse library to handle command-line arguments.

Dependencies

The script requires the following executables to be available in the PATH or the user should provide the path to the executables:

The script also requires the following Python libraries (which are listed in the requirements.txt file):

  • requests: A library for making HTTP requests. It abstracts the complexities of making requests behind a simple API, allowing you to send HTTP/1.1 requests.
  • colorama: Makes ANSI escape character sequences work on Windows and Unix systems, allowing colored terminal text and cursor positioning.
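  • argparse: Provides a way to specify the command-line arguments and options the program accepts. It has shipped with the Python standard library since Python 3.2, so a separate install is normally unnecessary.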
  • bs4 (BeautifulSoup4): A library for pulling data out of HTML and XML files. It provides Pythonic idioms for iterating, searching, and modifying the parse tree.
  • flask: A micro web framework written in Python. It does not require particular tools or libraries, and it has no database abstraction layer, form validation, or other components for which third-party libraries already provide common functions.
  • flask_socketio: Gives Flask applications access to low-latency bi-directional communication between clients and the server. The client-side application can use any of the official Socket.IO client libraries in JavaScript, C++, Java, and Swift, or any compatible client, to establish a permanent connection to the server.

To install these dependencies, run pip install -r requirements.txt from the command line.

To install them individually instead, run the following command:

pip install requests colorama argparse bs4 flask flask_socketio

Usage

You can use the script with the following command-line arguments:

  • --csv-file: Input CSV file. The legacy format is also supported.
  • --id: PhysicsWallah video ID for single-video use. Incompatible with --csv-file. Must be used with --name.
  • --name: Name for the output file. Incompatible with --csv-file. Must be used with --id.
  • --dir: Output Directory.
  • --verbose: Verbose Output.
  • --version: Shows the version of the program.
  • --simulate: Simulate the download process. No files will be downloaded.
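The options above could be declared with argparse roughly as follows. This is a minimal sketch and not the script's actual parser: the help strings, the store_true behaviour of --verbose and --simulate, and the placeholder version string are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the flags listed in this README (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="pwdl.py",
        description="PhysicsWallah M3u8 parser (illustrative sketch)",
    )
    parser.add_argument("--csv-file", help="Input CSV file")
    parser.add_argument("--id", help="Video id for single usage")
    parser.add_argument("--name", help="Name for the output file")
    parser.add_argument("--dir", help="Output directory")
    # Assumption: both flags are simple on/off switches.
    parser.add_argument("--verbose", action="store_true", help="Verbose output")
    parser.add_argument("--simulate", action="store_true",
                        help="Print what would be processed without downloading")
    # Assumption: the real version string is not given in this README.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["--csv-file", "input.csv", "--dir", "./output", "--verbose"]
)
print(args.csv_file, args.dir, args.verbose, args.simulate)
```

argparse converts --csv-file into the attribute csv_file, which is why the parsed value is read as args.csv_file.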

Example

python pwdl.py --csv-file input.csv --dir ./output --verbose

This will parse the M3u8 files listed in input.csv and save the output in the ./output directory. The --verbose flag is used to enable verbose output.

Error Handling

The script has built-in error handling. If an error occurs while parsing a file, the script prints an error message and continues with the next file. If a CSV file and an id (or name) are both provided, the script exits with error code 3.
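The behaviour described above can be sketched as follows. The helper names check_inputs and process_all, and the fake handler, are hypothetical and are not taken from the script; only the exit code 3 comes from this README.

```python
import sys
from typing import Callable, Iterable, List, Optional

CONFLICT_EXIT_CODE = 3  # exit code this README gives for conflicting inputs


def check_inputs(csv_file: Optional[str],
                 video_id: Optional[str],
                 name: Optional[str]) -> int:
    """Return 0 when the inputs are compatible, else the conflict exit code."""
    if csv_file and (video_id or name):
        return CONFLICT_EXIT_CODE
    return 0


def process_all(items: Iterable[str],
                handler: Callable[[str], None]) -> List[str]:
    """Run handler on every item, report failures, and keep going."""
    failed = []
    for item in items:
        try:
            handler(item)
        except Exception as exc:  # broad on purpose: one bad file must not stop the batch
            print(f"Error while processing {item!r}: {exc}", file=sys.stderr)
            failed.append(item)
    return failed


def fake_handler(item: str) -> None:
    """Stand-in for the real parsing step (hypothetical)."""
    if item == "bad":
        raise ValueError("could not parse")


failed = process_all(["a", "bad", "b"], fake_handler)
print(failed)
```

In a command-line entry point, a non-zero result from check_inputs would typically be passed to sys.exit.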

User Preferences

User preferences can be loaded from a defaults.json file. These preferences include the temporary directory (tmpDir), verbosity of output (verbose), and whether to display a horizontal rule (hr). If these preferences are not set in the defaults.json file, the script will use default values.

Note: The defaults.json file must now also include the token for the video download process.
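One way to load such preferences is sketched below. The fallback values in DEFAULT_PREFS and the key name token are assumptions, since this README names the tmpDir, verbose, and hr settings but gives neither their defaults nor the token's key.

```python
import json
import tempfile
from pathlib import Path

# Assumed fallback values; the script's real defaults are not documented here.
DEFAULT_PREFS = {"tmpDir": "./tmp", "verbose": False, "hr": False}


def load_prefs(path="defaults.json") -> dict:
    """Merge user preferences over the defaults and require a token."""
    file = Path(path)
    if not file.exists():
        raise FileNotFoundError(f"{file} not found")
    with file.open(encoding="utf-8") as fh:
        user_prefs = json.load(fh)
    prefs = {**DEFAULT_PREFS, **user_prefs}  # user values win over defaults
    if not prefs.get("token"):  # "token" is an assumed key name
        raise ValueError("Token not found")
    return prefs


# Demo against a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "defaults.json"
    demo_path.write_text(
        json.dumps({"verbose": True, "token": "<your_token>"}),
        encoding="utf-8",
    )
    prefs = load_prefs(demo_path)

print(prefs["verbose"], prefs["tmpDir"], prefs["hr"])
```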

Simulation Mode

The script includes a simulation mode, which can be enabled with the --simulate flag. In this mode, the script will print the files that would be processed, but no files will be downloaded.

Shell Mode

The script includes a shell mode, which can be enabled with the --shell flag.

API Endpoints

This section describes the API endpoints provided by api.py.

Create Task

Endpoint: /create_task

Method: POST

Description: This endpoint is used to create a new download task. It requires a JSON payload with the id and name of the video to be downloaded.

Payload:

{
    "id": "<video_id>",
    "name": "<video_name>"
}

Response: The endpoint returns a JSON object with the task_id of the created task.

{
    "task_id": "<task_id>"
}
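A client could build and send this request as sketched below, using only the standard library. The base URL http://localhost:5000 is an assumption (the host and port depend on how the API is started), and the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: adjust to where the API runs


def build_create_task_request(video_id: str, name: str,
                              base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare the POST /create_task request without sending it."""
    payload = json.dumps({"id": video_id, "name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/create_task",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_task(video_id: str, name: str, base_url: str = BASE_URL) -> str:
    """Send the request and return the task_id from the JSON response."""
    req = build_create_task_request(video_id, name, base_url)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["task_id"]


# Building the request needs no running server; create_task() does.
request = build_create_task_request("<video_id>", "<video_name>")
print(request.get_method(), request.full_url)
```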

Get Progress

Endpoint: /progress/<task_id>

Method: GET

Description: This endpoint is used to get the progress of a specific task. Replace <task_id> with the ID of the task.

Response: The endpoint returns a JSON object with the progress of the task.

Get File

Endpoint: /get-file/<task_id>/<name>

Method: GET

Description: This endpoint is used to download the file associated with a specific task. Replace <task_id> with the ID of the task and <name> with the name of the file.

Response: The endpoint returns the requested file as a download.
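The URLs for the progress and file endpoints can be assembled as in the sketch below. The base URL is an assumption and the helper names are hypothetical. Path segments are percent-encoded so that names containing spaces or slashes still form a valid URL.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # assumption: adjust to where the API runs


def progress_url(task_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET /progress/<task_id>."""
    return f"{base_url}/progress/{quote(task_id, safe='')}"


def file_url(task_id: str, name: str, base_url: str = BASE_URL) -> str:
    """URL for GET /get-file/<task_id>/<name>."""
    return f"{base_url}/get-file/{quote(task_id, safe='')}/{quote(name, safe='')}"


print(progress_url("abc123"))
print(file_url("abc123", "Lecture 1"))
```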

Index

Endpoint: /

Method: GET

Description: This is the index endpoint of the API. It returns a simple greeting message.

Response: The endpoint returns a JSON object with a greeting message.

{
    "message": "Hello, World!"
}

Please note that the API must be running for these endpoints to be accessible. You can start the API by running python ./beta/api/app.py.

Error Codes

| Error Name | Error Code | Error Message |
| --- | --- | --- |
| noError | 0 | None |
| defaultsNotFound | 1 | defaults.json not found. Exiting... |
| dependencyNotFound | 2 | Dependency not found. Exiting... |
| dependencyNotFoundInPrefs | 3 | Dependency not found in default settings. Exiting... |
| csvFileNotFound | 4 | CSV file {fileName} not found. Exiting... |
| downloadFailed | 5 | Download failed for {name} with id {id}. Exiting... |
| couldNotMakeDir | 6 | Could not make directory {dir}. Exiting... |
| tokenNotFound | 7 | Token not found. Exiting... |
| cantLoadFile | 22 | Can't load file {fileName} |
| requestFailedDueToUnknownReason | 24 | Request failed due to unknown reason. Status Code: {status_code} |
| keyExtractionFailed | 25 | Key extraction failed for id -> {id}. Exiting... |
| keyNotProvided | 26 | Key not provided. Exiting... |
| couldNotDownloadAudio | 27 | Could not download audio for id -> {id} Exiting... |
| couldNotDownloadVideo | 28 | Could not download video for {id} Exiting... |
| couldNotDecryptAudio | 29 | Could not decrypt audio. Exiting... |
| couldNotDecryptVideo | 30 | Could not decrypt video. Exiting... |
| methodPatched | 31 | Method is patched. Exiting... |
| couldNotExtractKey | 32 | Could not extract key. Exiting... |

Note that {fileName}, {name}, {id}, {dir}, and {status_code} in the Error Message column are placeholders that are replaced with actual values when the error occurs.
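The placeholder substitution can be illustrated with str.format. This is a sketch only: the PwdlError type, the ERRORS mapping, and the describe helper are hypothetical, and only three rows of the table are reproduced.

```python
from typing import NamedTuple, Tuple


class PwdlError(NamedTuple):
    """One row of the error table: numeric code plus message template."""
    code: int
    message: str


# A few rows copied from the table above.
ERRORS = {
    "csvFileNotFound": PwdlError(4, "CSV file {fileName} not found. Exiting..."),
    "downloadFailed": PwdlError(5, "Download failed for {name} with id {id}. Exiting..."),
    "requestFailedDueToUnknownReason": PwdlError(
        24, "Request failed due to unknown reason. Status Code: {status_code}"
    ),
}


def describe(error_name: str, **values) -> Tuple[int, str]:
    """Return the error code and the message with placeholders filled in."""
    err = ERRORS[error_name]
    return err.code, err.message.format(**values)


code, text = describe("csvFileNotFound", fileName="input.csv")
print(code, text)
```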