分类目录归档:未分类

正则表达式中?=和?:和?!的理解

// 前瞻:
exp1(?=exp2) 查找exp2前面的exp1
// 后顾:
(?<=exp2)exp1 查找exp2后面的exp1
// 负前瞻:
exp1(?!exp2) 查找后面不是exp2的exp1
// 负后顾:
(?<!exp2)exp1 查找前面不是exp2的exp1

()表示捕获分组,()会把每个分组里的匹配的值保存起来,使用$n(n是一个数字,表示第n个捕获组的内容),或者用(<groupname>)来给分组取个名字。
(?:)表示非捕获分组,和捕获分组唯一的区别在于,非捕获分组匹配的值不会保存起来

you-get 是python3开发的跨平台命令行视频下载工具

https://github.com/soimort/you-get

Prerequisites

The following dependencies are necessary:

Option 1: Install via pip

The official release of you-get is distributed on PyPI, and can be installed easily from a PyPI mirror via the pippackage manager. Note that you must use the Python 3 version of pip:

$ pip3 install you-get

Option 2: Install via Antigen (for Zsh users)

Add the following line to your .zshrc:

antigen bundle soimort/you-get

Option 3: Download from GitHub

You may either download the stable (identical with the latest release on PyPI) or the develop (more hotfixes, unstable features) branch of you-get. Unzip it, and put the directory containing the you-get script into your PATH.

Alternatively, run

$ [sudo] python3 setup.py install

Or

$ python3 setup.py install --user

to install you-get to a permanent path.

Option 4: Git clone

This is the recommended way for all developers, even if you don’t often code in Python.

$ git clone git://github.com/soimort/you-get.git

Then put the cloned directory into your PATH, or run ./setup.py install to install you-get to a permanent path.

Option 5: Homebrew (Mac only)

You can install you-get easily via:

$ brew install you-get

Option 6: pkg (FreeBSD only)

You can install you-get easily via:

# pkg install you-get

Shell completion

Completion definitions for Bash, Fish and Zsh can be found in contrib/completion. Please consult your shell’s manual for how to take advantage of them.

Upgrading

Based on which option you chose to install you-get, you may upgrade it via:

$ pip3 install --upgrade you-get

or download the latest release via:

$ you-get https://github.com/soimort/you-get/archive/master.zip

In order to get the latest develop branch without messing up the PIP, you can try:

$ pip3 install --upgrade git+https://github.com/soimort/you-get@develop

Getting Started

Download a video

When you get a video of interest, you might want to use the --info/-i option to see all available quality and formats:

$ you-get -i 'https://www.youtube.com/watch?v=jNQXAC9IVRw'
site:                YouTube
title:               Me at the zoo
streams:             # Available quality and codecs
    [ DASH ] ____________________________________
    - itag:          242
      container:     webm
      quality:       320x240
      size:          0.6 MiB (618358 bytes)
    # download-with: you-get --itag=242 [URL]

    - itag:          395
      container:     mp4
      quality:       320x240
      size:          0.5 MiB (550743 bytes)
    # download-with: you-get --itag=395 [URL]

    - itag:          133
      container:     mp4
      quality:       320x240
      size:          0.5 MiB (498558 bytes)
    # download-with: you-get --itag=133 [URL]

    - itag:          278
      container:     webm
      quality:       192x144
      size:          0.4 MiB (392857 bytes)
    # download-with: you-get --itag=278 [URL]

    - itag:          160
      container:     mp4
      quality:       192x144
      size:          0.4 MiB (370882 bytes)
    # download-with: you-get --itag=160 [URL]

    - itag:          394
      container:     mp4
      quality:       192x144
      size:          0.4 MiB (367261 bytes)
    # download-with: you-get --itag=394 [URL]

    [ DEFAULT ] _________________________________
    - itag:          43
      container:     webm
      quality:       medium
      size:          0.5 MiB (568748 bytes)
    # download-with: you-get --itag=43 [URL]

    - itag:          18
      container:     mp4
      quality:       small
    # download-with: you-get --itag=18 [URL]

    - itag:          36
      container:     3gp
      quality:       small
    # download-with: you-get --itag=36 [URL]

    - itag:          17
      container:     3gp
      quality:       small
    # download-with: you-get --itag=17 [URL]

By default, the one on the top is the one you will get. If that looks cool to you, download it:

$ you-get 'https://www.youtube.com/watch?v=jNQXAC9IVRw'
site:                YouTube
title:               Me at the zoo
stream:
    - itag:          242
      container:     webm
      quality:       320x240
      size:          0.6 MiB (618358 bytes)
    # download-with: you-get --itag=242 [URL]

Downloading Me at the zoo.webm ...
 100% (  0.6/  0.6MB) ├██████████████████████████████████████████████████████████████████████████████┤[2/2]    2 MB/s
Merging video parts... Merged into Me at the zoo.webm

Saving Me at the zoo.en.srt ... Done.

(If a YouTube video has any closed captions, they will be downloaded together with the video file, in SubRip subtitle format.)

Or, if you prefer another format (mp4), just use whatever the option you-get shows to you:

$ you-get --itag=18 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

Note:

  • At this point, format selection has not been generally implemented for most of our supported sites; in that case, the default format to download is the one with the highest quality.
  • ffmpeg is a required dependency, for downloading and joining videos streamed in multiple parts (e.g. on some sites like Youku), and for YouTube videos of 1080p or high resolution.
  • If you don’t want you-get to join video parts after downloading them, use the --no-merge/-n option.

Download anything else

If you already have the URL of the exact resource you want, you can download it directly with:

$ you-get https://stallman.org/rms.jpg
Site:       stallman.org
Title:      rms
Type:       JPEG Image (image/jpeg)
Size:       0.06 MiB (66482 Bytes)

Downloading rms.jpg ...
100.0% (  0.1/0.1  MB) ├████████████████████████████████████████┤[1/1]  127 kB/s

Otherwise, you-get will scrape the web page and try to figure out if there’s anything interesting to you:

$ you-get http://kopasas.tumblr.com/post/69361932517
Site:       Tumblr.com
Title:      kopasas
Type:       Unknown type (None)
Size:       0.51 MiB (536583 Bytes)

Site:       Tumblr.com
Title:      tumblr_mxhg13jx4n1sftq6do1_1280
Type:       Portable Network Graphics (image/png)
Size:       0.51 MiB (536583 Bytes)

Downloading tumblr_mxhg13jx4n1sftq6do1_1280.png ...
100.0% (  0.5/0.5  MB) ├████████████████████████████████████████┤[1/1]   22 MB/s

Note:

  • This feature is an experimental one and far from perfect. It works best on scraping large-sized images from popular websites like Tumblr and Blogger, but there is really no universal pattern that can apply to any site on the Internet.

Search on Google Videos and download

You can pass literally anything to you-get. If it isn’t a valid URL, you-get will do a Google search and download the most relevant video for you. (It might not be exactly the thing you wish to see, but still very likely.)

$ you-get "Richard Stallman eats"

Pause and resume a download

You may use Ctrl+C to interrupt a download.

A temporary .download file is kept in the output directory. Next time you run you-get with the same arguments, the download progress will resume from the last session. In case the file is completely downloaded (the temporary .download extension is gone), you-get will just skip the download.

To enforce re-downloading, use the --force/-f option. (Warning: doing so will overwrite any existing file or temporary file with the same name!)

Set the path and name of downloaded file

Use the --output-dir/-o option to set the path, and --output-filename/-O to set the name of the downloaded file:

$ you-get -o ~/Videos -O zoo.webm 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

Tips:

  • These options are helpful if you encounter problems with the default video titles, which may contain special characters that do not play well with your current shell / operating system / filesystem.
  • These options are also helpful if you write a script to batch download files and put them into designated folders with designated names.

Proxy settings

You may specify an HTTP proxy for you-get to use, via the --http-proxy/-x option:

$ you-get -x 127.0.0.1:8087 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

However, the system proxy setting (i.e. the environment variable http_proxy) is applied by default. To disable any proxy, use the --no-proxy option.

Tips:

  • If you need to use proxies a lot (in case your network is blocking certain sites), you might want to use you-get with proxychains and set alias you-get="proxychains -q you-get" (in Bash).
  • For some websites (e.g. Youku), if you need access to some videos that are only available in mainland China, there is an option of using a specific proxy to extract video information from the site: --extractor-proxy/-y.

Watch a video

Use the --player/-p option to feed the video into your media player of choice, e.g. mpv or vlc, instead of downloading it:

$ you-get -p vlc 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

Or, if you prefer to watch the video in a browser, just without ads or comment section:

$ you-get -p chromium 'https://www.youtube.com/watch?v=jNQXAC9IVRw'

Tips:

  • It is possible to use the -p option to start another download manager, e.g., you-get -p uget-gtk 'https://www.youtube.com/watch?v=jNQXAC9IVRw', though they may not play together very well.

Load cookies

Not all videos are publicly available to anyone. If you need to log in your account to access something (e.g., a private video), it would be unavoidable to feed the browser cookies to you-get via the --cookies/-c option.

Note:

  • As of now, we are supporting two formats of browser cookies: Mozilla cookies.sqlite and Netscape cookies.txt.

Reuse extracted data

Use --url/-u to get a list of downloadable resource URLs extracted from the page. Use --json to get an abstract of extracted data in the JSON format.