Initial Commit
commit
87406a2d43
7
.gitignore
vendored
Normal file
@ -0,0 +1,7 @@
__pycache__
.DS_Store
.direnv
data
venv
openai_key
vreader.egg-info/
18
Dockerfile
Normal file
@ -0,0 +1,18 @@
# Build Container
FROM python:3.11-slim

# Install App
WORKDIR /app
COPY . /app

# Install App & Gunicorn
RUN pip install .
RUN pip3 install gunicorn

# Cleanup
RUN rm -rf /app

# Start Application
ENTRYPOINT ["gunicorn"]
EXPOSE 5000
CMD ["vreader:create_app()", "--bind", "0.0.0.0:5000", "--threads=4", "--access-logfile", "-"]
339
LICENSE
Normal file
@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.

We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.

b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.

c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,

b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,

c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.

9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

NO WARRANTY

11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

{{description}}
Copyright (C) {{year}} {{fullname}}

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.

{signature of Ty Coon}, 1 April 1989
Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.
15
Makefile
Normal file
@ -0,0 +1,15 @@
docker_build_local:
	docker build -t vreader:latest .

docker_build_release_dev:
	docker buildx build \
	  --platform linux/amd64,linux/arm64 \
	  -t gitea.va.reichard.io/evan/vreader:dev \
	  --push .

docker_build_release_latest:
	docker buildx build \
	  --platform linux/amd64,linux/arm64 \
	  -t gitea.va.reichard.io/evan/vreader:latest \
	  -t gitea.va.reichard.io/evan/vreader:`git describe --tags` \
	  --push .
46
README.md
Normal file
@ -0,0 +1,46 @@
# VReader

Turn YouTube videos into articles! I banged this one out in a couple of hours, so it's a bit scrappy. I'll slowly improve it.

## Running Server

```bash
# Locally (See "Development" Section)
export OPENAI_API_KEY=`cat openai_key`

vreader server run

# Docker Quick Start
docker run \
  -p 5000:5000 \
  -e OPENAI_API_KEY=`cat openai_key` \
  -e DATA_PATH=/data \
  -v ./data:/data \
  gitea.va.reichard.io/evan/vreader:latest
```

The server will now be accessible at `http://localhost:5000`

## Configuration

| Environment Variable | Default Value | Description                         |
| -------------------- | ------------- | ----------------------------------- |
| OPENAI_API_KEY       | NONE          | Required OpenAI API Key for ChatGPT |
| DATA_PATH            | NONE          | Where to store the data             |

## Development

```bash
# Initiate
python3 -m venv venv
. ./venv/bin/activate

# Local Development
pip install -e .

# Creds & Other Environment Variables
export OPENAI_API_KEY=`cat openai_key`

# Docker
make docker_build_local
```
25
pyproject.toml
Normal file
@ -0,0 +1,25 @@
[project]
name = "vreader"
version = "0.0.1"
description = "Turn videos into articles!"
authors = [
  { name = "Evan Reichard", email = "evan@reichard.io" },
]
license = { file = "LICENSE" }
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
  "Flask>=3.0",
  "openai==0.28.1",
  "openai[datalib]==0.28.1",
  "click",
  "yt-dlp",
  "markdown",
  "html-sanitizer"
]

[project.scripts]
vreader = "vreader:cli"

[tool.setuptools.packages]
find = {}
7
shell.nix
Normal file
@ -0,0 +1,7 @@
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  packages = with pkgs; [
    python311
  ];
}
42
vreader/__init__.py
Normal file
@ -0,0 +1,42 @@
import click
import signal
import sys
from importlib.metadata import version
from vreader.oai import OpenAIConnector
from vreader.video import VideoManager
from flask import Flask
from flask.cli import FlaskGroup

__version__ = version("vreader")

def signal_handler(sig, frame):
    sys.exit(0)


def create_app():
    global oai, vman

    from vreader.config import Config
    import vreader.api.common as api_common
    import vreader.api.v1 as api_v1

    app = Flask(__name__)
    oai = OpenAIConnector(Config.OPENAI_API_KEY)
    vman = VideoManager()

    app.register_blueprint(api_common.bp)
    app.register_blueprint(api_v1.bp)

    return app


@click.group()
def cli():
    """VReader CLI"""


@cli.group(cls=FlaskGroup, create_app=create_app)
def server():
    """VReader flask server"""

signal.signal(signal.SIGINT, signal_handler)
64
vreader/api/common.py
Normal file
@ -0,0 +1,64 @@
from flask import Blueprint
from flask import make_response, render_template
from html_sanitizer import Sanitizer
from markdown import markdown
from vreader.config import Config
import os

bp = Blueprint("common", __name__)
sanitizer = Sanitizer()

@bp.route("/", methods=["GET"])
def main_entry():

    directory = str(Config.DATA_PATH)

    all_files = os.listdir(directory)
    markdown_files = [file for file in all_files if file.endswith(".md")]
    articles = [parse_filename(file) for file in markdown_files]

    return make_response(render_template("index.html", articles=articles))

@bp.route("/articles/<id>", methods=["GET"])
def article_item(id):

    if len(id) != 11:
        return make_response(render_template("404.html")), 404

    metadata = get_article_metadata(id)
    if not metadata:
        return make_response(render_template("404.html")), 404

    try:
        with open(metadata["filepath"], 'r', encoding='utf-8') as file:
            article_contents = file.read()

        markdown_html = sanitizer.sanitize(markdown(article_contents))

        return make_response(
            render_template("article.html", metadata=metadata, markdown_html=markdown_html)
        )
    except Exception as _:
        return make_response(render_template("404.html")), 404


def get_article_metadata(id):
    directory = str(Config.DATA_PATH)
    files = os.listdir(directory)
    for file_name in files:
        if file_name.startswith(id) and file_name.endswith(".md"):
            file_path = os.path.join(directory, file_name)
            metadata = parse_filename(file_name)
            metadata["filepath"] = file_path
            return metadata
    return None


def parse_filename(filename):
    video_id = filename[:11]
    title = filename[12:][:-3]

    return {
        "video_id": video_id,
        "title": title
    }
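The routes above depend on an implicit naming convention for article files: an 11-character video ID, an underscore, the title, and a `.md` suffix. The following is a minimal sketch of that convention as a round trip. `build_filename` and `split_filename` are hypothetical names introduced for illustration and are not part of this commit; only the slicing mirrors `parse_filename`.

```python
# Sketch only: build_filename() and split_filename() are hypothetical helpers,
# not part of this commit. They mirror the slicing used by parse_filename().

VIDEO_ID_LENGTH = 11  # the length the routes check for


def build_filename(video_id: str, title: str) -> str:
    """Join a video ID and a title into '<id>_<title>.md'."""
    if len(video_id) != VIDEO_ID_LENGTH:
        raise ValueError("video_id must be 11 characters long")
    return f"{video_id}_{title}.md"


def split_filename(filename: str) -> dict:
    """Recover the ID and title from a filename built as above."""
    return {
        "video_id": filename[:VIDEO_ID_LENGTH],
        # Skip the "_" separator, then drop the ".md" suffix.
        "title": filename[VIDEO_ID_LENGTH + 1:-3],
    }


if __name__ == "__main__":
    name = build_filename("abcdefghijk", "Example Title")
    print(name)
    print(split_filename(name))
```

A title containing a path separator would not round-trip under this scheme, so a fuller implementation would probably sanitize titles before using them in a filename.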
78
vreader/api/v1.py
Normal file
@ -0,0 +1,78 @@
import os
from os import path
from flask import Blueprint, request
from vreader.config import Config
import vreader

bp = Blueprint("v1", __name__, url_prefix="/api/v1")

@bp.route("/articles", methods=["GET"])
def articles():
    directory = str(Config.DATA_PATH)

    all_files = os.listdir(directory)
    markdown_files = [file for file in all_files if file.endswith(".md")]
    articles = [parse_filename(file) for file in markdown_files]

    return articles

@bp.route("/generate", methods=["POST"])
def generate():
    data = request.get_json()
    if not data:
        return {"error": "Missing Data"}

    video = str(data.get("video") or "")
    if video == "":
        return {"error": "Missing Data"}

    if len(video) != 11:
        return {"error": "Invalid VideoID"}

    metadata = get_article_metadata(video)
    if metadata is not None:
        return {"video": video}

    context = vreader.vman.transcribe_video(video)
    if context is None:
        return {"error": "Unable to Extract Subtitles"}

    resp = vreader.oai.query(context)

    # Get Details
    directory = str(Config.DATA_PATH)
    title = resp.get("title")
    content = resp.get("content")

    # Derive Filename
    new_title = f"{video}_{title}"
    file_path = path.join(directory, f"{new_title}.md")

    # Write File
    file = open(file_path, 'w', encoding='utf-8')
    file.write(content)
    file.close()

    return { "title": resp["title"] }


def get_article_metadata(id):
    directory = str(Config.DATA_PATH)
    files = os.listdir(directory)
    for file_name in files:
        if file_name.startswith(id) and file_name.endswith(".md"):
            file_path = os.path.join(directory, file_name)
            metadata = parse_filename(file_name)
            metadata["filepath"] = file_path
            return metadata
    return None


def parse_filename(filename):
    video_id = filename[:11]
    title = filename[12:][:-3]

    return {
        "video_id": video_id,
        "title": title
    }
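A minimal sketch of how a client might call the `generate` route, assuming the server listens on `http://localhost:5000` as stated in the README. The helper names are hypothetical and only the standard library is used. Building the request is kept separate from sending it, so the request can be inspected without a running server.

```python
import json
import urllib.request

# Assumption: the address given in the README's quick start.
BASE_URL = "http://localhost:5000"


def build_generate_request(video_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to /api/v1/generate."""
    body = json.dumps({"video": video_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_article(video_id: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON reply.

    Per the route above, the reply holds a "title", a "video", or an "error" key.
    """
    request = build_generate_request(video_id, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_generate_request("abcdefghijk")
    print(req.get_method(), req.full_url)
```

Because the route reports failures in the response body, a caller would likely check for an `"error"` key and not rely on the HTTP status alone.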
24
vreader/config.py
Normal file
@ -0,0 +1,24 @@
import os


def get_env(key, default=None, required=False) -> str | None:
    """Wrapper for gathering env vars."""
    if required:
        assert key in os.environ, "Missing Environment Variable: %s" % key
    env = os.environ.get(key, default)
    return str(env) if env is not None else None


class Config:
    """Wrap application configurations

    Attributes
    ----------
    DATA_PATH : str
        The path where resources are stored (no default; None if unset)
    OPENAI_API_KEY : str
        OpenAI API Key - Required
    """

    DATA_PATH: str | None = get_env("DATA_PATH", required=False)
    OPENAI_API_KEY: str | None = get_env("OPENAI_API_KEY", required=True)
67
vreader/oai.py
Normal file
@ -0,0 +1,67 @@
from dataclasses import dataclass
from textwrap import indent
from typing import Any, List
import json
import openai

INITIAL_PROMPT_TEMPLATE = """
The following is a video transcription. Write a fully comprehensive article in markdown appropriately utilizing subsections. Be sure to only use the following transcription to write the article:

{context}
"""

INITIAL_PROMPT_TEMPLATE_OLD = """
The following is a video transcription. Write a comprehensive article in markdown utilizing the following content:

{context}
"""

@dataclass
class ChatCompletion:
    id: str
    object: str
    created: int
    model: str
    choices: List[dict]
    usage: dict


class OpenAIConnector:
    def __init__(self, api_key: str | None):
        if api_key is None:
            raise RuntimeError("OPENAI_API_KEY Required")

        # self.model = "gpt-3.5-turbo-16k"
        self.model = "gpt-3.5-turbo-1106"
        self.word_cap = 1000
        openai.api_key = api_key


    def query(self, context: str) -> Any:
        # Create Initial Prompt
        prompt = INITIAL_PROMPT_TEMPLATE.format(context = context)
        messages = [{"role": "user", "content": prompt}]

        print("[OpenAIConnector] Running OAI Query")

        # Article Call
        response: ChatCompletion = openai.ChatCompletion.create( # type: ignore
            model=self.model,
            messages=messages
        )

        # Markdown Data
        content = response.choices[0]["message"]["content"]
        title = self.get_title(content)

        print("[OpenAIConnector] Completed OAI Query:\n", indent(json.dumps({ "usage": response.usage }, indent=2), ' ' * 2))

        # Return Response
        return { "title": title, "content": content }

    def get_title(self, markdown: str):
        lines = markdown.split('\n')
        for line in lines:
            if line.startswith("# "):
                # Drop only the leading "# " marker; strip("# ") would also
                # eat trailing "#" characters that belong to the title
                return line[2:].strip()
        return None
15
vreader/templates/404.html
Normal file
@ -0,0 +1,15 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - Article</title>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body class="bg-slate-200 h-[100dvh] p-5 flex flex-col justify-between">
    {{ markdown_html|safe }}
  </body>
</html>
48
vreader/templates/article.html
Normal file
@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - {{ metadata.title }}</title>
    <script src="https://cdn.tailwindcss.com"></script>
    <style>
      #content {
        h1 {
          font-size: 1.75em;
          font-weight: 400;
        }
        h2 {
          font-size: 1.25em;
        }
        p {
          margin-top: 0.25em;
          margin-bottom: 1.5em;
        }
      }
    </style>
  </head>
  <body class="bg-slate-200">
    <header class="w-screen h-16 bg-slate-300 mb-5">
      <div
        class="flex px-2 h-16 w-11/12 md:w-5/6 mx-auto rounded bg-slate-300"
      >
        <a class="font-bold flex justify-center items-center" href="/">All Articles</a>
      </div>
    </header>
    <div
      id="content"
      class="w-11/12 md:w-5/6 mx-auto rounded px-10 py-5 bg-slate-300"
    >
      <div class="flex justify-center pb-5 w-full">
        <a target="_blank" rel="noopener noreferrer" href="https://www.youtube.com/watch?v={{ metadata.video_id }}">
          <img class="h-32 rounded" alt="Video thumbnail" src="https://i.ytimg.com/vi_webp/{{ metadata.video_id }}/maxresdefault.webp" />
        </a>
      </div>
      <hr class="border-gray-500 pb-5" />
      {{ markdown_html|safe }}
    </div>
  </body>
</html>
152
vreader/templates/index.html
Normal file
@ -0,0 +1,152 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - Home</title>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body class="bg-slate-200">
    <header class="w-screen h-16 bg-slate-300 mb-5">
      <div
        class="flex px-2 h-16 w-11/12 md:w-5/6 mx-auto rounded bg-slate-300"
      >
        <span class="font-bold flex justify-center items-center">VReader</span>
      </div>
    </header>

    <main class="flex flex-col gap-4">
      <div id="submit"
        class="flex gap-4 items-center text-lg w-11/12 md:w-4/6 mx-auto rounded px-6 py-3 bg-slate-300"
      >
        <input type="text" placeholder="YouTube URL" class="w-full p-2 bg-gray-300 text-black dark:bg-gray-700 dark:text-white">
        <button class="p-2 text-white bg-gray-500 dark:text-gray-800 hover:bg-gray-800 dark:hover:bg-gray-100" type="submit">Generate</button>
      </div>

      {% for article in articles %}
      <a
        href="/articles/{{ article.video_id }}"
        class="flex items-center text-lg w-11/12 md:w-4/6 mx-auto rounded px-6 py-3 bg-slate-300 hover:bg-slate-400 transition-all duration-200"
      >
        <img class="h-14 md:h-24 mr-6 rounded" alt="Video thumbnail" src="https://i.ytimg.com/vi_webp/{{ article.video_id }}/maxresdefault.webp" />
        <span>{{ article.title }}</span>
      </a>
      {% endfor %}
    </main>
    <script>
      const LOADING_SVG = `<svg
        class="w-full"
        width="24"
        height="24"
        viewBox="0 0 24 24"
        xmlns="http://www.w3.org/2000/svg"
        fill="currentColor"
      >
        <style>
          .spinner_qM83 {
            animation: spinner_8HQG 1.05s infinite;
          }
          .spinner_oXPr {
            animation-delay: 0.1s;
          }
          .spinner_ZTLf {
            animation-delay: 0.2s;
          }
          @keyframes spinner_8HQG {
            0%,
            57.14% {
              animation-timing-function: cubic-bezier(0.33, 0.66, 0.66, 1);
              transform: translate(0);
            }
            28.57% {
              animation-timing-function: cubic-bezier(0.33, 0, 0.66, 0.33);
              transform: translateY(-6px);
            }
            100% {
              transform: translate(0);
            }
          }
        </style>
        <circle class="spinner_qM83" cx="4" cy="12" r="3"></circle>
        <circle class="spinner_qM83 spinner_oXPr" cx="12" cy="12" r="3"></circle>
        <circle class="spinner_qM83 spinner_ZTLf" cx="20" cy="12" r="3"></circle>
      </svg>`;

      /**
       * Wrapper API Call
       **/
      function apiCall(data) {
        let fetchObj = {
          method: data.method || "GET",
          headers: {
            "Content-Type": "application/json",
          },
        };

        if (fetchObj.method === "POST")
          fetchObj.body = JSON.stringify(data.data || {});

        return fetch(data.url, fetchObj).then((resp) => resp.json());
      }

      function getVideoArticle(videoID) {
        return apiCall({
          url: "/api/v1/generate",
          method: "POST",
          data: { video: videoID },
        });
      }

      function generateAction() {
        let inputEl = document.querySelector("input");
        let inputVal = inputEl.value;
        let videoID = getYouTubeVideoId(inputVal);
        if (!videoID) return alert("Invalid URL");

        // Loading
        let submitEl = document.querySelector("#submit");
        let oldHTML = submitEl.innerHTML;
        submitEl.innerHTML = LOADING_SVG;

        // Do API Call
        getVideoArticle(videoID)
          .then((resp) => {
            if ("error" in resp) throw new Error(resp.error);
            window.location.href = "/articles/" + videoID;
          })
          .catch((e) => {
            console.log(e);
            alert(e.message);
            submitEl.innerHTML = oldHTML;
            // Restoring innerHTML recreates the input and button, so rebind them
            initListeners();
          });
      }

      function initListeners() {
        let buttonEl = document.querySelector("button");
        let inputEl = document.querySelector("input");
        buttonEl.addEventListener("click", generateAction);
        inputEl.addEventListener("keydown", function (event) {
          // event.keyCode is deprecated; compare the key name instead
          if (event.key !== "Enter") return;
          generateAction();
        });
      }

      function getYouTubeVideoId(url) {
        // The dot in "youtu.be" is escaped so it matches a literal "."
        const regExp = /^.*(?:youtu\.be\/|v\/|u\/\w\/|embed\/|watch\?v=|&v=)([^#&?]*).*/;
        const match = url.match(regExp);
        if (match && match[1]) {
          return match[1];
        } else {
          return null;
        }
      }

      initListeners();
    </script>
  </body>
</html>
42
vreader/video.py
Normal file
@ -0,0 +1,42 @@
import os
from yt_dlp import YoutubeDL
import xml.etree.ElementTree as ET

class VideoManager():
    """Transcribe Videos"""

    def transcribe_video(self, video_id: str):
        URLS = [video_id]

        vid = YoutubeDL({
            "skip_download": True,
            "writesubtitles": True,
            "writeautomaticsub": True,
            "subtitleslangs": ["en"],
            "subtitlesformat": "ttml",
            "outtmpl": "transcript"
        })

        vid.download(URLS)

        # No subtitle file is written when the video has no English subtitles
        if not os.path.exists("transcript.en.ttml"):
            return None

        content = self.convert_ttml_to_plain_text("transcript.en.ttml")
        os.remove("transcript.en.ttml")

        return content


    def convert_ttml_to_plain_text(self, ttml_file_path):
        try:
            # Parse the TTML file
            tree = ET.parse(ttml_file_path)
            root = tree.getroot()

            # Process Text
            plain_text = ""
            for elem in root.iter():
                if elem.text:
                    plain_text += elem.text + " "

            return plain_text.strip()
        except ET.ParseError as e:
            print("[VideoManager] TTML Conversion Error:", e)
            return None