YouTube.js/lib/parser/README.md
Daniel Wykerd fb68e6bcfe feat!: better cross runtime support (#97)
2022-07-20 14:06:12 -03:00


Parser

Sanitizes and standardizes InnerTube responses while maintaining the integrity of the data. Also drastically improves how API calls are made and handled.

API

parse(data, requireArray, validTypes)

Responsible for parsing specifically the contents property of the response object.

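| Param | Type | Description |
| --- | --- | --- |
| data | any | The contents property |
| requireArray | ?boolean | Whether the response should be an array |
| validTypes | YTNodeConstructor \| YTNodeConstructor[] | Node type or types the result may contain (optional) |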

When requireArray is true, the response will be an ObservedArray<YTNode>.

When validTypes is undefined, the response will be an array of YTNodes of any type.

When validTypes is an array, the response will be an array of YTNodes that are of the types specified in the array.

When validTypes is a single type, the response will be an array of YTNodes that are of the type specified.

If you do not specify requireArray, the return type of the function cannot be known ahead of time because it depends on the data. The response is therefore returned wrapped in a helper, SuperParsedResult, which you unwrap to gain access to the value.

You may use the Parser#parseArray and Parser#parseItem methods to parse the response in a deterministic way.
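The difference between the two call styles can be illustrated with TypeScript overloads. The following is a minimal standalone sketch, not the library's implementation: `parseSketch`, `ResultSketch`, and the stand-in `YTNode` class are hypothetical names, and the "parsing" just wraps each raw value in a node.

```typescript
// Hypothetical stand-in for the library's node class.
class YTNode {
  constructor(readonly type: string) {}
}

// Hypothetical wrapper for a result whose shape is only known at runtime.
class ResultSketch {
  constructor(private readonly value: YTNode | YTNode[] | null) {}
  get is_null(): boolean { return this.value === null; }
  get is_array(): boolean { return Array.isArray(this.value); }
  get is_item(): boolean { return this.value instanceof YTNode; }
  array(): YTNode[] {
    const value = this.value;
    if (!Array.isArray(value)) throw new TypeError('Expected an array');
    return value;
  }
  item(): YTNode {
    const value = this.value;
    if (!(value instanceof YTNode)) throw new TypeError('Expected a single node');
    return value;
  }
}

// With requireArray = true the compiler knows the result is an array;
// otherwise the caller receives the wrapper and must check it.
function parseSketch(data: unknown, requireArray: true): YTNode[];
function parseSketch(data: unknown, requireArray?: false): ResultSketch;
function parseSketch(data: unknown, requireArray?: boolean): YTNode[] | ResultSketch {
  const toNode = (raw: unknown): YTNode => new YTNode(String(raw));
  if (requireArray) {
    if (!Array.isArray(data)) throw new TypeError('Expected array data');
    return data.map(toNode);
  }
  if (data === null || data === undefined) return new ResultSketch(null);
  return new ResultSketch(Array.isArray(data) ? data.map(toNode) : toNode(data));
}

const list = parseSketch(['Video', 'Channel'], true); // typed as YTNode[]
const unknownShape = parseSketch('Video');            // typed as ResultSketch
```

The overloads move the array check into the type system, which is the same trade-off the text describes: either state up front what you expect, or inspect the wrapper afterwards.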

parseResponse(data)

Unlike parse, this can be used to parse the entire response object.

| Param | Type | Description |
| --- | --- | --- |
| data | object | Raw InnerTube response |

Returns: object

ObservedArray

You may use ObservedArray<T extends YTNode> as a normal array, but it provides additional methods for typesafe access and casting.

// For example, we have a feed and want all the videos
const nodes = new ObservedArray<YTNode>([...feed.contents]);
const videos = nodes.filterType(GridVideo);
// This is now a GridVideo[]

// Or we want only the first video
const firstVideo = nodes.firstOfType(GridVideo);

// We may cast the whole array to a GridVideo[]; this throws if any element is not a GridVideo
const allVideos = nodes.as(GridVideo);

// There are some extra methods on ObservedArray<T extends YTNode>
// which we use internally but have not documented here (yet).
// See the source code for more details.
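To make the behaviour of these helpers concrete, here is a rough standalone sketch written as free functions over a plain array. It is an assumption about how such helpers could work, not the library's code: `filterType`, `firstOfType`, and `castAll` below are hypothetical functions, and the node classes are stand-ins.

```typescript
// Hypothetical stand-ins for the library's node classes.
class YTNode {}
class GridVideo extends YTNode {
  constructor(readonly video_id: string) { super(); }
}
class GridChannel extends YTNode {
  constructor(readonly channel_id: string) { super(); }
}

type NodeConstructor<T extends YTNode> = new (...args: any[]) => T;

// Keep only the nodes that are instances of the given class.
function filterType<T extends YTNode>(nodes: YTNode[], type: NodeConstructor<T>): T[] {
  return nodes.filter((node): node is T => node instanceof type);
}

// Return the first node of the given class, or undefined if there is none.
function firstOfType<T extends YTNode>(nodes: YTNode[], type: NodeConstructor<T>): T | undefined {
  for (const node of nodes) {
    if (node instanceof type) return node;
  }
  return undefined;
}

// Cast the whole array, throwing if any element has a different class.
function castAll<T extends YTNode>(nodes: YTNode[], type: NodeConstructor<T>): T[] {
  if (!nodes.every((node) => node instanceof type)) {
    throw new TypeError('Array contains nodes of an unexpected type');
  }
  return nodes as T[];
}

const mixed: YTNode[] = [new GridChannel('c1'), new GridVideo('v1'), new GridVideo('v2')];
const onlyVideos = filterType(mixed, GridVideo);   // GridVideo[]
const firstVideoSketch = firstOfType(mixed, GridVideo); // GridVideo | undefined
```

Passing the class itself as an argument lets one value serve both as the runtime check (`instanceof`) and as the source of the static return type.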

SuperParsedResult

Represents a parsed response in an unknown state: either a YTNode, an ObservedArray<YTNode>, or null.

You will need to assert the type and unwrap the response to get the actual value.

// We can assert we have a YTNode
const response = Parser.parse(data);
if (response.is_item) {
  const node = response.item();
}

// We can assert we have an ObservedArray<YTNode>
const response = Parser.parse(data);
if (response.is_array) {
  const nodes = response.array();
}

// or lastly a null response
const response = Parser.parse(data);
const is_null = response.is_null;

YTNode

All renderers returned by InnerTube are converted to this generic class, which is then extended for the specific renderers.

This class gives us a typesafe way to use data returned by the InnerTube API.

Here's how to use this class to access returned data:

Type Casting

// We can cast a YTNode to a child class of YTNode
const results = node.as(TwoColumnSearchResults);
// This will throw if the node is not a TwoColumnSearchResults
// We thus may want to check for the type of the node before casting
if (node.is(TwoColumnSearchResults)) {
  // We do not need to recast the node, it is already a TwoColumnSearchResults after calling is() and using it in the branch where is() returns true
  const results = node;
}

// Sometimes we can expect multiple types of nodes, we can just pass all possible types as params
const results = node.as(TwoColumnSearchResults, VideoList);
// The type of `results` will now be `TwoColumnSearchResults | VideoList`

// similarly, we can check if the node is of a certain type
if (node.is(TwoColumnSearchResults, VideoList)) {
  // Again no casting is needed, the node is already of the correct type
  const results = node;
}
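The narrowing behaviour described in the comments above can be reproduced with a `this`-based type predicate over a variadic list of constructors. This is a minimal sketch under the assumption that types are matched with `instanceof`; the class bodies and the `summarize` function are hypothetical and only exist to show the inference.

```typescript
type NodeConstructor<T extends YTNode = YTNode> = new (...args: any[]) => T;

// Hypothetical stand-in for the library's base class.
class YTNode {
  // True if this node is an instance of any of the given classes.
  // The return type narrows `this` to the union of those classes.
  is<K extends NodeConstructor[]>(...types: K): this is InstanceType<K[number]> {
    return types.some((type) => this instanceof type);
  }

  // Same check, but throws instead of returning false.
  as<K extends NodeConstructor[]>(...types: K): InstanceType<K[number]> {
    if (!this.is(...types)) {
      throw new TypeError('Node is not one of the expected types');
    }
    return this;
  }
}

class TwoColumnSearchResults extends YTNode {
  constructor(readonly query: string) { super(); }
}
class VideoList extends YTNode {
  constructor(readonly video_ids: string[]) { super(); }
}

function summarize(node: YTNode): string {
  if (node.is(TwoColumnSearchResults, VideoList)) {
    // Here `node` is TwoColumnSearchResults | VideoList, no cast needed.
    return node instanceof VideoList
      ? `list of ${node.video_ids.length}`
      : `search for ${node.query}`;
  }
  return 'unknown node';
}
```

Collecting the constructors into a tuple type `K` is what allows the result to be the union `InstanceType<K[number]>` rather than a single inferred class.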

Accessing properties without casting

Sometimes multiple node types share the same property, and we don't want to check the node's type before accessing it. For example, the property "contents" is used by many node types, and more may be added in the future. In such cases we want to assert only the property instead of casting to a specific type.

// Accessing a property on a node when you aren't sure it exists
const prop = node.key("contents");
// This returns the value wrapped in a `Maybe` type,
// which you can use to find the type of the value.
// Note, however, that this throws an error if the key doesn't exist,
// so we may want to check for the key before accessing it
if (node.hasKey("contents")) {
  const prop = node.key("contents");
}

// we can assert the type of the value
const prop = node.key("contents");
if (prop.isString()) {
  const value = prop.string();
}

// we can do more complex assertions too,
// like checking for instanceof
const prop = node.key("contents");
if (prop.isInstanceof(Text)) {
  const text = prop.instanceof(Text);
  // and then use the value as the given type
  text.runs.forEach(run => {
    console.log(run.text);
  });
}

// There are some special methods for use with the parser,
// such as getting the value as a YTNode
const prop = node.key("contents");
if (prop.isNode()) {
  const node = prop.node();
}

// Like with YTNode, keys can also be checked for YTNode child class types
const prop = node.key("contents");
if (prop.isNodeOfType(TwoColumnSearchResults)) {
  const results = prop.nodeOfType(TwoColumnSearchResults);
}

// or we can check for multiple types of nodes
const prop = node.key("contents");
if (prop.isNodeOfType([TwoColumnSearchResults, VideoList])) {
  const results = prop.nodeOfType<TwoColumnSearchResults | VideoList>([TwoColumnSearchResults, VideoList]);
}

// Sometimes an ObservedArray is returned when working with parsed data
// We've got a helper for that too
const prop = node.key("contents");
if (prop.isObserved()) {
  const array = prop.observed();

  // Now we may use all the ObservedArray methods as normal,
  // like finding nodes of a certain type, for example
  const results = array.filterType(GridVideo);
}

// Other times a SuperParsedResult is returned, like when using the `Parser#parse` method
const prop = node.key("contents");
if (prop.isParsed()) {
  const result = prop.parsed();

  // SuperParsedResult is another helper for typesafe access to the parsed data
  // It is explained above with the `Parser#parse` method
  const nodes = result.array();
  const videos = nodes.filterType(Video);
}

// Sometimes we just want to debug something and not interested in finding the type
// This will, however, warn you when being used.
const prop = node.key("contents");
const value = prop.any();

// Arrays are also a special case as every element may be of a different type
// the `arrayOfMaybe` method will return an array of `Maybe`s
const prop = node.key("contents");
if (prop.isArray()) {
  const array = prop.arrayOfMaybe(); 
  // This will return Maybe[]
}

// or if you want zero typesafety you can use the `array` method
const prop = node.key("contents");
if (prop.isArray()) {
  const array = prop.array();
  // This will return any[]
}
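The idea behind `Maybe` is a wrapper that holds an untyped value and only hands it out after a runtime check. Below is a minimal standalone sketch of that pattern covering a few of the methods shown above; `MaybeSketch` and `NodeSketch` are hypothetical classes and the library's own implementation may differ.

```typescript
// Hypothetical wrapper around a value of unknown type.
class MaybeSketch {
  constructor(private readonly value: unknown) {}

  isString(): boolean { return typeof this.value === 'string'; }
  string(): string {
    const value = this.value;
    if (typeof value !== 'string') throw new TypeError('Expected a string');
    return value;
  }

  isArray(): boolean { return Array.isArray(this.value); }
  arrayOfMaybe(): MaybeSketch[] {
    const value = this.value;
    if (!Array.isArray(value)) throw new TypeError('Expected an array');
    // Each element may have a different type, so wrap them one by one.
    return value.map((entry) => new MaybeSketch(entry));
  }

  isInstanceof<T>(type: new (...args: any[]) => T): boolean {
    return this.value instanceof type;
  }
  instanceof<T>(type: new (...args: any[]) => T): T {
    const value = this.value;
    if (!(value instanceof type)) throw new TypeError('Unexpected instance type');
    return value;
  }
}

// Hypothetical node exposing hasKey/key over a plain record of fields.
class NodeSketch {
  constructor(private readonly fields: { [key: string]: unknown }) {}
  hasKey(key: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.fields, key);
  }
  key(key: string): MaybeSketch {
    if (!this.hasKey(key)) throw new Error(`Missing key: ${key}`);
    return new MaybeSketch(this.fields[key]);
  }
}

const sketchNode = new NodeSketch({ title: 'Hello', contents: ['a', 1, new Date(0)] });
const titleText = sketchNode.key('title').string();
const wrappedContents = sketchNode.key('contents').arrayOfMaybe();
```

Pairing every accessor with an `is...` check keeps the failure mode explicit: callers either test first or accept that the accessor throws.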

Memo

The Memo class is a helper class for memoizing values in the Parser#parseResponse method. It is useful for finding nodes after parsing the response.

Say we want all of the videos in a search result. We can use the Memo to find all of them quickly without recursing through the response.

const response = Parser.parseResponse(data);
const videos = response.contents_memo.getType(Video);
// This returns the nodes as an ObservedArray<Video>

Memo extends Map<string, YTNode[]> and can be used as a normal Map too if you want.
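As an illustration of why this lookup is cheap, here is a small standalone sketch of a memo that groups nodes by a type string while they are being parsed. The static `type` field, the `add` method, and the `MemoSketch` name are assumptions made for the example, and it returns a plain array rather than an ObservedArray.

```typescript
// Hypothetical stand-ins: each class carries a static type name.
class YTNode {
  static readonly type: string = 'YTNode';
  get type(): string {
    return (this.constructor as typeof YTNode).type;
  }
}
class Video extends YTNode {
  static readonly type = 'Video';
  constructor(readonly id: string) { super(); }
}
class Channel extends YTNode {
  static readonly type = 'Channel';
  constructor(readonly name: string) { super(); }
}

interface NodeClass<T extends YTNode> {
  new (...args: any[]): T;
  readonly type: string;
}

// A Map from type name to every node of that type seen so far.
class MemoSketch extends Map<string, YTNode[]> {
  // Called once per node during parsing (assumed hook).
  add(node: YTNode): void {
    const bucket = this.get(node.type);
    if (bucket) bucket.push(node);
    else this.set(node.type, [node]);
  }

  // A single Map lookup instead of a recursive walk over the response.
  getType<T extends YTNode>(type: NodeClass<T>): T[] {
    return (this.get(type.type) ?? []) as T[];
  }
}

const memo = new MemoSketch();
[new Video('v1'), new Channel('c1'), new Video('v2')].forEach((node) => memo.add(node));
const memoVideos = memo.getType(Video); // Video[]
```

Because the grouping happens while the response is parsed, later queries by type do not need to traverse the tree again.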

How it works

If you decompile a YouTube client and analyze it for a while, you will notice that it has classes named protos/youtube/api/innertube/MusicItemRenderer, protos/youtube/api/innertube/SectionListRenderer, etc.

These classes are used to parse objects from the response (which consists of protobuf messages) and also to build requests. The website works in a similar way; the difference is that it uses plain JSON (likely converted from protobuf server-side, hence the weird structure of the response).

Here we're taking a similar approach: the parser goes through all the renderers and parses their inner element(s). The final result is a nicely structured JSON. On top of that, it also parses navigation endpoints, which allows us to make an API call with all required parameters in one line and even emulate client actions (e.g., clicking a button).

Here is your average, arguably ugly InnerTube response:

{
  sidebar: {
    playlistSidebarRenderer: {
      items: [
        {
          playlistSidebarPrimaryInfoRenderer: {
            title: {
              simpleText: '..'
            },
            description: {
              runs: [
                {
                  text: '..'
                },
                //....
              ]
            },
            stats: [
              {
                simpleText: '..'
              },
              {
                runs: [
                  {
                    text: '..'
                  }
                ]
              }
            ]
          }
        }
      ]
    }
  }
}

And what we get after parsing it:

{
  sidebar: {
    type: 'PlaylistSidebar',
    contents: [
      {
        type: 'PlaylistSidebarPrimaryInfo',
        title: { text: '..' },
        description: { text: '..' },
        stats: [
          {
            text: '..'
          },
          {
            text: '..'
          }
        ]
      }
    ]
  }
}
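The transformation above can be approximated with a small recursive function. This is a toy sketch only, written under simplifying assumptions: it strips the `Renderer` suffix to derive a type name and flattens `simpleText` and `runs` into `{ text }`, but it does not rename properties (the real output above turns `items` into `contents`, which depends on the specific node class). All function names here are hypothetical.

```typescript
type Raw = { [key: string]: unknown };

const isRaw = (value: unknown): value is Raw =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

// 'playlistSidebarRenderer' -> 'PlaylistSidebar'
function typeName(key: string): string {
  const base = key.replace(/Renderer$/, '');
  return base.charAt(0).toUpperCase() + base.slice(1);
}

function parseFields(raw: Raw): Raw {
  const out: Raw = {};
  for (const key of Object.keys(raw)) out[key] = parseValue(raw[key]);
  return out;
}

function parseValue(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(parseValue);
  if (!isRaw(value)) return value;

  // Text objects come as either { simpleText } or { runs: [{ text }, ...] }.
  const simpleText = value['simpleText'];
  if (typeof simpleText === 'string') return { text: simpleText };
  const runs = value['runs'];
  if (Array.isArray(runs)) {
    const parts = runs.map((run) => (isRaw(run) ? String(run['text'] ?? '') : ''));
    return { text: parts.join('') };
  }

  // A renderer is an object with a single '...Renderer' key wrapping its fields.
  const keys = Object.keys(value);
  const only = keys[0];
  if (keys.length === 1 && only !== undefined && /Renderer$/.test(only)) {
    const inner = value[only];
    return { type: typeName(only), ...(isRaw(inner) ? parseFields(inner) : {}) };
  }
  return parseFields(value);
}

const rawResponse = {
  sidebar: {
    playlistSidebarRenderer: {
      items: [{
        playlistSidebarPrimaryInfoRenderer: {
          title: { simpleText: 'A' },
          description: { runs: [{ text: 'B' }, { text: 'C' }] }
        }
      }]
    }
  }
};
const parsedResponse = parseValue(rawResponse);
```

In this sketch the type name is derived from the key alone; a fuller version would look up a dedicated class per renderer, which is where property renaming and navigation endpoint parsing would live.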