Initial Commit
commit
87406a2d43
7
.gitignore
vendored
Normal file
@ -0,0 +1,7 @@
__pycache__
.DS_Store
.direnv
data
venv
openai_key
vreader.egg-info/
18
Dockerfile
Normal file
@ -0,0 +1,18 @@
# Build Container
FROM python:3.11-slim

# Install App
WORKDIR /app
COPY . /app

# Install App & Gunicorn
RUN pip install .
RUN pip3 install gunicorn

# Cleanup
RUN rm -rf /app

# Start Application
ENTRYPOINT ["gunicorn"]
EXPOSE 5000
CMD ["vreader:create_app()", "--bind", "0.0.0.0:5000", "--threads=4", "--access-logfile", "-"]
339
LICENSE
Normal file
@ -0,0 +1,339 @@
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Lesser General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
How to Apply These Terms to Your New Programs
|
||||
|
||||
If you develop a new program, and you want it to be of the greatest
|
||||
possible use to the public, the best way to achieve this is to make it
|
||||
free software which everyone can redistribute and change under these terms.
|
||||
|
||||
To do so, attach the following notices to the program. It is safest
|
||||
to attach them to the start of each source file to most effectively
|
||||
convey the exclusion of warranty; and each file should have at least
|
||||
the "copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
{{description}}
|
||||
Copyright (C) {{year}} {{fullname}}
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License along
|
||||
with this program; if not, write to the Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
If the program is interactive, make it output a short notice like this
|
||||
when it starts in an interactive mode:
|
||||
|
||||
Gnomovision version 69, Copyright (C) year name of author
|
||||
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||
This is free software, and you are welcome to redistribute it
|
||||
under certain conditions; type `show c' for details.
|
||||
|
||||
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||
parts of the General Public License. Of course, the commands you use may
|
||||
be called something other than `show w' and `show c'; they could even be
|
||||
mouse-clicks or menu items--whatever suits your program.
|
||||
|
||||
You should also get your employer (if you work as a programmer) or your
|
||||
school, if any, to sign a "copyright disclaimer" for the program, if
|
||||
necessary. Here is a sample; alter the names:
|
||||
|
||||
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
|
||||
`Gnomovision' (which makes passes at compilers) written by James Hacker.
|
||||
|
||||
{signature of Ty Coon}, 1 April 1989
|
||||
Ty Coon, President of Vice
|
||||
|
||||
This General Public License does not permit incorporating your program into
|
||||
proprietary programs. If your program is a subroutine library, you may
|
||||
consider it more useful to permit linking proprietary applications with the
|
||||
library. If this is what you want to do, use the GNU Lesser General
|
||||
Public License instead of this License.
|
15
Makefile
Normal file
@ -0,0 +1,15 @@
docker_build_local:
	docker build -t vreader:latest .

docker_build_release_dev:
	docker buildx build \
		--platform linux/amd64,linux/arm64 \
		-t gitea.va.reichard.io/evan/vreader:dev \
		--push .

docker_build_release_latest:
	docker buildx build \
		--platform linux/amd64,linux/arm64 \
		-t gitea.va.reichard.io/evan/vreader:latest \
		-t gitea.va.reichard.io/evan/vreader:`git describe --tags` \
		--push .
46
README.md
Normal file
@ -0,0 +1,46 @@
# VReader

Turn YouTube videos into articles! I banged this one out in a couple of hours, so it's a bit scrappy. Will slowly improve it.

## Running Server

```bash
# Locally (See "Development" Section)
export OPENAI_API_KEY=`cat openai_key`

vreader server run

# Docker Quick Start
docker run \
  -p 5000:5000 \
  -e OPENAI_API_KEY=`cat openai_key` \
  -e DATA_PATH=/data \
  -v ./data:/data \
  gitea.va.reichard.io/evan/vreader:latest
```

The server will now be accessible at `http://localhost:5000`
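
As a rough sketch of how a client might ask the running server for an article, the snippet below posts a video ID to the `/api/v1/generate` route registered in `vreader/api/v1.py`. The helper names `build_generate_request` and `generate_article`, and the sample video ID, are illustrative assumptions and not part of the project.

```python
import json
import urllib.request

# Assumption: the server was started as shown above and listens on port 5000.
BASE_URL = "http://localhost:5000"


def build_generate_request(video_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for the /api/v1/generate route."""
    # The handler rejects anything that is not 11 characters long,
    # so fail early on the client side as well.
    if len(video_id) != 11:
        raise ValueError("expected an 11-character YouTube video ID")
    body = json.dumps({"video": video_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_article(video_id: str) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(video_id)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder video ID, used only for illustration.
    print(generate_article("dQw4w9WgXcQ"))
```

Depending on which branch of the handler runs, the reply is a JSON object with a `title`, `video`, or `error` key.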

## Configuration

| Environment Variable | Default Value | Description                         |
| -------------------- | ------------- | ----------------------------------- |
| OPENAI_API_KEY       | NONE          | Required OpenAI API Key for ChatGPT |
| DATA_PATH            | NONE          | Where to store the data             |
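
A minimal sketch of how these two variables could be read at startup, assuming the same rules as the table: the API key is mandatory and the data path has no default. The function name `read_settings` is hypothetical; the project's own loader lives in `vreader/config.py`.

```python
import os


def read_settings(environ=os.environ) -> dict:
    """Collect the two documented variables, failing early if the key is missing."""
    if "OPENAI_API_KEY" not in environ:
        raise RuntimeError("Missing Environment Variable: OPENAI_API_KEY")
    return {
        "OPENAI_API_KEY": environ["OPENAI_API_KEY"],
        # No default is documented, so an unset DATA_PATH stays None.
        "DATA_PATH": environ.get("DATA_PATH"),
    }


if __name__ == "__main__":
    # Print only the data path so the API key never ends up in logs.
    print(read_settings()["DATA_PATH"])
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.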

## Development

```bash
# Initiate
python3 -m venv venv
. ./venv/bin/activate

# Local Development
pip install -e .

# Creds & Other Environment Variables
export OPENAI_API_KEY=`cat openai_key`

# Docker
make docker_build_local
```
25
pyproject.toml
Normal file
@ -0,0 +1,25 @@
[project]
name = "vreader"
version = "0.0.1"
description = "Turn videos into articles!"
authors = [
  { name = "Evan Reichard", email = "evan@reichard.io" },
]
license = { file = "LICENSE" }
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
  "Flask>=3.0",
  "openai==0.28.1",
  "openai[datalib]==0.28.1",
  "click",
  "yt-dlp",
  "markdown",
  "html-sanitizer"
]

[project.scripts]
vreader = "vreader:cli"

[tool.setuptools.packages]
find = {}
7
shell.nix
Normal file
@ -0,0 +1,7 @@
{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  packages = with pkgs; [
    python311
  ];
}
42
vreader/__init__.py
Normal file
@ -0,0 +1,42 @@
import click
import signal
import sys
from importlib.metadata import version
from vreader.oai import OpenAIConnector
from vreader.video import VideoManager
from flask import Flask
from flask.cli import FlaskGroup

__version__ = version("vreader")

def signal_handler(sig, frame):
    sys.exit(0)


def create_app():
    global oai, vman

    from vreader.config import Config
    import vreader.api.common as api_common
    import vreader.api.v1 as api_v1

    app = Flask(__name__)
    oai = OpenAIConnector(Config.OPENAI_API_KEY)
    vman = VideoManager()

    app.register_blueprint(api_common.bp)
    app.register_blueprint(api_v1.bp)

    return app


@click.group()
def cli():
    """VReader CLI"""


@cli.group(cls=FlaskGroup, create_app=create_app)
def server():
    """VReader flask server"""

signal.signal(signal.SIGINT, signal_handler)
64
vreader/api/common.py
Normal file
@ -0,0 +1,64 @@
from flask import Blueprint
from flask import make_response, render_template
from html_sanitizer import Sanitizer
from markdown import markdown
from vreader.config import Config
import os

bp = Blueprint("common", __name__)
sanitizer = Sanitizer()

@bp.route("/", methods=["GET"])
def main_entry():

    directory = str(Config.DATA_PATH)

    all_files = os.listdir(directory)
    markdown_files = [file for file in all_files if file.endswith(".md")]
    articles = [parse_filename(file) for file in markdown_files]

    return make_response(render_template("index.html", articles=articles))

@bp.route("/articles/<id>", methods=["GET"])
def article_item(id):

    if len(id) != 11:
        return make_response(render_template("404.html")), 404

    metadata = get_article_metadata(id)
    if not metadata:
        return make_response(render_template("404.html")), 404

    try:
        with open(metadata["filepath"], 'r', encoding='utf-8') as file:
            article_contents = file.read()

        markdown_html = sanitizer.sanitize(markdown(article_contents))

        return make_response(
            render_template("article.html", metadata=metadata, markdown_html=markdown_html)
        )
    except Exception as _:
        return make_response(render_template("404.html")), 404


def get_article_metadata(id):
    directory = str(Config.DATA_PATH)
    files = os.listdir(directory)
    for file_name in files:
        if file_name.startswith(id) and file_name.endswith(".md"):
            file_path = os.path.join(directory, file_name)
            metadata = parse_filename(file_name)
            metadata["filepath"] = file_path
            return metadata
    return None


def parse_filename(filename):
    video_id = filename[:11]
    title = filename[12:][:-3]

    return {
        "video_id": video_id,
        "title": title
    }
78
vreader/api/v1.py
Normal file
@ -0,0 +1,78 @@
import os
from os import path
from flask import Blueprint, request
from vreader.config import Config
import vreader

bp = Blueprint("v1", __name__, url_prefix="/api/v1")

@bp.route("/articles", methods=["GET"])
def articles():
    directory = str(Config.DATA_PATH)

    all_files = os.listdir(directory)
    markdown_files = [file for file in all_files if file.endswith(".md")]
    articles = [parse_filename(file) for file in markdown_files]

    return articles

@bp.route("/generate", methods=["POST"])
def generate():
    data = request.get_json()
    if not data:
        return {"error": "Missing Data"}

    # Check before converting: str(None) is "None", which would hide a missing key
    video = data.get("video")
    if not video:
        return {"error": "Missing Data"}

    video = str(video)
    if len(video) != 11:
        return {"error": "Invalid VideoID"}

    # Article already exists for this video
    metadata = get_article_metadata(video)
    if metadata is not None:
        return {"video": video}

    context = vreader.vman.transcribe_video(video)
    if context is None:
        return {"error": "Unable to Extract Subtitles"}

    resp = vreader.oai.query(context)

    # Get Details
    directory = str(Config.DATA_PATH)
    title = resp.get("title")
    content = resp.get("content")

    # Derive Filename
    new_title = f"{video}_{title}"
    file_path = path.join(directory, f"{new_title}.md")

    # Write File (the context manager closes the handle even if the write fails)
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(content)

    return { "title": resp["title"] }


def get_article_metadata(id):
    directory = str(Config.DATA_PATH)
    files = os.listdir(directory)
    for file_name in files:
        if file_name.startswith(id) and file_name.endswith(".md"):
            file_path = os.path.join(directory, file_name)
            metadata = parse_filename(file_name)
            metadata["filepath"] = file_path
            return metadata
    return None


def parse_filename(filename):
    video_id = filename[:11]
    title = filename[12:][:-3]

    return {
        "video_id": video_id,
        "title": title
    }
24
vreader/config.py
Normal file
@ -0,0 +1,24 @@
import os


def get_env(key, default=None, required=False) -> str | None:
    """Wrapper for gathering env vars."""
    if required:
        assert key in os.environ, "Missing Environment Variable: %s" % key
    env = os.environ.get(key, default)
    return str(env) if env is not None else None


class Config:
    """Wrap application configurations

    Attributes
    ----------
    DATA_PATH : str | None
        The path where to store any resources (no default)
    OPENAI_API_KEY : str
        OpenAI API Key - Required
    """

    DATA_PATH: str | None = get_env("DATA_PATH", required=False)
    OPENAI_API_KEY: str | None = get_env("OPENAI_API_KEY", required=True)
67
vreader/oai.py
Normal file
@ -0,0 +1,67 @@
from dataclasses import dataclass
from textwrap import indent
from typing import Any, List
import json
import openai

INITIAL_PROMPT_TEMPLATE = """
The following is a video transcription. Write a fully comprehensive article in markdown appropriately utilizing subsections. Be sure to only use the following transcription to write the article:

{context}
"""

INITIAL_PROMPT_TEMPLATE_OLD = """
The following is a video transcription. Write a comprehensive article in markdown utilizing the following content:

{context}
"""

@dataclass
class ChatCompletion:
    id: str
    object: str
    created: int
    model: str
    choices: List[dict]
    usage: dict


class OpenAIConnector:
    def __init__(self, api_key: str | None):
        if api_key is None:
            raise RuntimeError("OPENAI_API_KEY Required")

        # self.model = "gpt-3.5-turbo-16k"
        self.model = "gpt-3.5-turbo-1106"
        self.word_cap = 1000
        openai.api_key = api_key


    def query(self, context: str) -> Any:
        # Create Initial Prompt
        prompt = INITIAL_PROMPT_TEMPLATE.format(context=context)
        messages = [{"role": "user", "content": prompt}]

        print("[OpenAIConnector] Running OAI Query")

        # Article Call
        response: ChatCompletion = openai.ChatCompletion.create( # type: ignore
            model=self.model,
            messages=messages
        )

        # Markdown Data
        content = response.choices[0]["message"]["content"]
        title = self.get_title(content)

        print("[OpenAIConnector] Completed OAI Query:\n", indent(json.dumps({ "usage": response.usage }, indent=2), ' ' * 2))

        # Return Response
        return { "title": title, "content": content }

    def get_title(self, markdown: str):
        lines = markdown.split('\n')
        for line in lines:
            if line.startswith("# "):
                return line[2:].strip()  # drop only the leading "# "; strip("# ") would also eat a trailing "#"
        return None
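A minimal, standalone sketch of the title extraction performed by `get_title` in `vreader/oai.py`, assuming the generated article marks its title with an ATX-style `# ` heading. The name `extract_title` is hypothetical and not part of this commit. The sketch slices off the two-character prefix, because `str.strip("# ")` treats its argument as a set of characters and would also remove a trailing `#` (for example in a title ending in `C#`).

```python
from typing import Optional


def extract_title(markdown: str) -> Optional[str]:
    """Return the text of the first level-one heading, or None if absent."""
    for line in markdown.splitlines():
        if line.startswith("# "):
            # Drop only the leading "# " marker; keep any "#" inside the title.
            return line[2:].strip()
    return None


if __name__ == "__main__":
    print(extract_title("intro\n# Learning C#\n## Setup"))
```

Second-level headings (`## ...`) are skipped because their second character is `#`, not a space.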
15
vreader/templates/404.html
Normal file
@ -0,0 +1,15 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - Article</title>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body class="bg-slate-200 h-[100dvh] p-5 flex flex-col justify-between">
    {{ markdown_html|safe }}
  </body>
</html>
48
vreader/templates/article.html
Normal file
@ -0,0 +1,48 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - {{ metadata.title }}</title>
    <script src="https://cdn.tailwindcss.com"></script>
    <style>
      #content {
        h1 {
          font-size: 1.75em;
          font-weight: 400;
        }
        h2 {
          font-size: 1.25em;
        }
        p {
          margin-top: 0.25em;
          margin-bottom: 1.5em;
        }
      }
    </style>
  </head>
  <body class="bg-slate-200">
    <header class="w-screen h-16 bg-slate-300 mb-5">
      <div
        class="flex px-2 h-16 w-11/12 md:w-5/6 mx-auto rounded bg-slate-300"
      >
        <a class="font-bold flex justify-center items-center" href="/">All Articles</a>
      </div>
    </header>
    <div
      id="content"
      class="w-11/12 md:w-5/6 mx-auto rounded px-10 py-5 bg-slate-300"
    >
      <div class="flex justify-center pb-5 w-full">
        <a target="_blank" href="https://www.youtube.com/watch?v={{ metadata.video_id }}">
          <img class="h-32 rounded" src="https://i.ytimg.com/vi_webp/{{ metadata.video_id }}/maxresdefault.webp" />
        </a>
      </div>
      <hr class="border-gray-500 pb-5" />
      {{ markdown_html|safe }}
    </div>
  </body>
</html>
152
vreader/templates/index.html
Normal file
@ -0,0 +1,152 @@
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=0.9, user-scalable=no, viewport-fit=cover"
    />
    <title>VReader - Home</title>
    <script src="https://cdn.tailwindcss.com"></script>
  </head>
  <body class="bg-slate-200">
    <header class="w-screen h-16 bg-slate-300 mb-5">
      <div
        class="flex px-2 h-16 w-11/12 md:w-5/6 mx-auto rounded bg-slate-300"
      >
        <span class="font-bold flex justify-center items-center">VReader</span>
      </div>
    </header>

    <main class="flex flex-col gap-4">
      <div id="submit"
        class="flex gap-4 items-center text-lg w-11/12 md:w-4/6 mx-auto rounded px-6 py-3 bg-slate-300"
      >
        <input type="text" placeholder="YouTube URL" class="w-full p-2 bg-gray-300 text-black dark:bg-gray-700 dark:text-white">
        <button class="p-2 text-white bg-gray-500 dark:text-gray-800 hover:bg-gray-800 dark:hover:bg-gray-100" type="submit">Generate</button>
      </div>

      {% for article in articles %}
      <a
        href="/articles/{{ article.video_id }}"
        class="flex items-center text-lg w-11/12 md:w-4/6 mx-auto rounded px-6 py-3 bg-slate-300 hover:bg-slate-400 transition-all duration-200"
      >
        <img class="h-14 md:h-24 mr-6 rounded" src="https://i.ytimg.com/vi_webp/{{ article.video_id }}/maxresdefault.webp" />
        <span>{{ article.title }}</span>
      </a>
      {% endfor %}
    </main>
    <script>
      const LOADING_SVG = `<svg
        class="w-full"
        width="24"
        height="24"
        viewBox="0 0 24 24"
        xmlns="http://www.w3.org/2000/svg"
        fill="currentColor"
      >
        <style>
          .spinner_qM83 {
            animation: spinner_8HQG 1.05s infinite;
          }
          .spinner_oXPr {
            animation-delay: 0.1s;
          }
          .spinner_ZTLf {
            animation-delay: 0.2s;
          }
          @keyframes spinner_8HQG {
            0%,
            57.14% {
              animation-timing-function: cubic-bezier(0.33, 0.66, 0.66, 1);
              transform: translate(0);
            }
            28.57% {
              animation-timing-function: cubic-bezier(0.33, 0, 0.66, 0.33);
              transform: translateY(-6px);
            }
            100% {
              transform: translate(0);
            }
          }
        </style>
        <circle class="spinner_qM83" cx="4" cy="12" r="3"></circle>
        <circle class="spinner_qM83 spinner_oXPr" cx="12" cy="12" r="3"></circle>
        <circle class="spinner_qM83 spinner_ZTLf" cx="20" cy="12" r="3"></circle>
      </svg>`;

      /**
       * Wrapper API Call
       **/
      function apiCall(data) {
        let fetchObj = {
          method: data.method || "GET",
          headers: {
            "Content-Type": "application/json",
          },
        };

        if (fetchObj.method === "POST")
          fetchObj.body = JSON.stringify(data.data || {});

        return fetch(data.url, fetchObj).then((resp) => resp.json());
      }

      function getVideoArticle(videoID) {
        return apiCall({
          url: "/api/v1/generate",
          method: "POST",
          data: { video: videoID },
        });
      }

      function generateAction(){
        let inputEl = document.querySelector("input");
        let inputVal = inputEl.value;
        let videoID = getYouTubeVideoId(inputVal);
        if (!videoID) return alert("Invalid URL");

        // Loading
        let submitEl = document.querySelector("#submit");
        let oldHTML = submitEl.innerHTML;
        submitEl.innerHTML = LOADING_SVG;

        // Do API Call
        apiCall({
          url: "/api/v1/generate",
          method: "POST",
          data: { video: videoID },
        }).then((resp) => {
          if ("error" in resp) throw new Error(resp.error);
          window.location.href = "/articles/" + videoID;
        }).catch(e => {
          console.log(e);
          alert(e.message);
          submitEl.innerHTML = oldHTML; initListeners(); // innerHTML recreated the input and button, so re-bind listeners
        });
      }

      function initListeners(){
        let buttonEl = document.querySelector("button");
        let inputEl = document.querySelector("input");
        buttonEl.addEventListener("click", generateAction);
        inputEl.addEventListener("keydown", function(event) {
          if (event.key !== "Enter") return;
          generateAction();
        });
      }

      function getYouTubeVideoId(url) {
        var regExp = /^.*(?:youtu\.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*).*/;
        var match = url.match(regExp);
        if (match && match[1]) {
          return match[1];
        } else {
          return null;
        }
      }

      initListeners();
    </script>
  </body>
</html>
42
vreader/video.py
Normal file
@ -0,0 +1,42 @@
import os
from yt_dlp import YoutubeDL
import xml.etree.ElementTree as ET

class VideoManager():
    """Transcribe Videos"""

    def transcribe_video(self, video_id: str):
        URLS = [video_id]

        vid = YoutubeDL({
            "skip_download": True,
            "writesubtitles": True,
            "writeautomaticsub": True,
            "subtitleslangs": ["en"],
            "subtitlesformat": "ttml",
            "outtmpl": "transcript"
        })

        vid.download(URLS)
        content = self.convert_ttml_to_plain_text("transcript.en.ttml")
        os.remove("transcript.en.ttml")

        return content


    def convert_ttml_to_plain_text(self, ttml_file_path):
        try:
            # Parse the TTML file
            tree = ET.parse(ttml_file_path)
            root = tree.getroot()

            # Process Text (itertext also yields .tail text, e.g. after <br />)
            plain_text = ""
            for text in root.itertext():
                if text.strip():
                    plain_text += text.strip() + " "

            return plain_text.strip()
        except ET.ParseError as e:
            print("[VideoManager] TTML Conversion Error:", e)
            return None
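A small sketch of the TTML-to-text step from `convert_ttml_to_plain_text`, run against an in-memory sample rather than a downloaded subtitle file. The sample document and the `ttml_to_text` name are illustrative assumptions; real YouTube TTML carries more attributes and styling. The sketch relies on `Element.itertext()`, since text that follows an inline element such as `<br />` is stored in that element's `.tail` and is missed when only `.text` is read.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TTML document used only for illustration.
SAMPLE_TTML = """<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="00:00:00.000" end="00:00:02.000">hello<br />world</p>
      <p begin="00:00:02.000" end="00:00:04.000">second line</p>
    </div>
  </body>
</tt>"""


def ttml_to_text(ttml: str) -> str:
    """Flatten a TTML string into space-separated caption text."""
    root = ET.fromstring(ttml)
    # itertext() yields both .text and .tail fragments in document order.
    fragments = (chunk.strip() for chunk in root.itertext())
    # Skip the whitespace-only fragments that come from indentation.
    return " ".join(chunk for chunk in fragments if chunk)


if __name__ == "__main__":
    print(ttml_to_text(SAMPLE_TTML))
```

Parsing from a string also avoids the shared `transcript.en.ttml` path on disk, which may matter when several worker threads transcribe at once.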