sqlports info available as a json in the CDN/mirrors, yay or nay?

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

sqlports info available as a json in the CDN/mirrors, yay or nay?

Juan Francisco Cantero Hurtado
More than a year ago I talked with espie@ about making the sqlports info
available as a json file in the CDNs/mirrors. My idea was to provide
that info for people who want to create simple/static javascript web
pages, external tools for monitoring the ports tree or some type of
automation system.

Nobody wants to maintain a dynamic web page to show info about the
ports. We had one (i can't even remember the domain) and was
decommissioned long time ago. A static web page reading the json from
the CDN with javascript would be quite easy to maintain. Also, other
webs could use the info (like repology.org). Or tools like [1]. You
could compare packages between versions of OpenBSD without uncompress
various copies of sqlports.

Everything would be generated by a simple python port during the bulk.
The script is now 20 lines and takes less than 10 seconds. The files
would be copied to pub/OpenBSD/{snapshots,version}/packages/arch. Maybe
only to amd64, because the info is the same. I abandoned the idea one
year ago because I wanted to convert the table Ports and the json was
27MB[2]. Now I'm splitting a limited part of Ports in small json files.

Examples:

1,3MB, FullPkgPath, COMMENT, MAINTAINER
https://juanfra.keybase.pub/tmp-ports/packages-info.json

1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME
https://juanfra.keybase.pub/tmp-ports/packages-versions.json

1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS, TEST_DEPENDS
https://juanfra.keybase.pub/tmp-ports/packages-depends.json


I've the python script but I've not started yet with the port. The
question for the porters (and in special people running the bulks,
signing/uploading packages) is, are you against this idea?.


1. https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py

2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json


--
Juan Francisco Cantero Hurtado http://juanfra.info

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Daniel Jakots-6
I think it would be indeed nice to provide a way for people to easily
see the content of the ports tree without having to take the sqlite3 db
out from sqlports.tgz.

Regarding the JSON, I think only one, with as much info as
possible with avoiding bloat, should be provided. And they should have
the field names ;)

Cheers,
Daniel

On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
<[hidden email]> wrote:

> More than a year ago I talked with espie@ about making the sqlports
> info available as a json file in the CDNs/mirrors. My idea was to
> provide that info for people who want to create simple/static
> javascript web pages, external tools for monitoring the ports tree or
> some type of automation system.
>
> Nobody wants to maintain a dynamic web page to show info about the
> ports. We had one (i can't even remember the domain) and was
> decommissioned long time ago. A static web page reading the json from
> the CDN with javascript would be quite easy to maintain. Also, other
> webs could use the info (like repology.org). Or tools like [1]. You
> could compare packages between versions of OpenBSD without uncompress
> various copies of sqlports.
>
> Everything would be generated by a simple python port during the bulk.
> The script is now 20 lines and takes less than 10 seconds. The files
> would be copied to pub/OpenBSD/{snapshots,version}/packages/arch.
> Maybe only to amd64, because the info is the same. I abandoned the
> idea one year ago because I wanted to convert the table Ports and the
> json was 27MB[2]. Now I'm splitting a limited part of Ports in small
> json files.
>
> Examples:
>
> 1,3MB, FullPkgPath, COMMENT, MAINTAINER
> https://juanfra.keybase.pub/tmp-ports/packages-info.json
>
> 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME
> https://juanfra.keybase.pub/tmp-ports/packages-versions.json
>
> 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS,
> TEST_DEPENDS
> https://juanfra.keybase.pub/tmp-ports/packages-depends.json
>
>
> I've the python script but I've not started yet with the port. The
> question for the porters (and in special people running the bulks,
> signing/uploading packages) is, are you against this idea?.
>
>
> 1.
> https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py
>
> 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json
>
>

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Landry Breuil-5
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> I think it would be indeed nice to provide a way for people to easily
> see the content of the ports tree without having to take the sqlite3 db
> out from sqlports.tgz.
>
> Regarding the JSON, I think only one, with as much info as
> possible with avoiding bloat, should be provided. And they should have
> the field names ;)

as for repology, in the meantime i've hacked my way to extract
ports-INDEX from portslist package in
http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
https://github.com/repology/repology/issues/149, because right now it's
showing misleading info.

Landry

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Juan Francisco Cantero Hurtado
In reply to this post by Daniel Jakots-6
On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> I think it would be indeed nice to provide a way for people to easily
> see the content of the ports tree without having to take the sqlite3 db
> out from sqlports.tgz.
>
> Regarding the JSON, I think only one, with as much info as
> possible with avoiding bloat, should be provided. And they should have
> the field names ;)

Done.

3,0MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-depends.json

1,8MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-info.json

1,7MB
https://juanfra.keybase.pub/tmp-ports/take2/packages-versions.json


> On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
> <[hidden email]> wrote:
>
> > More than a year ago I talked with espie@ about making the sqlports
> > info available as a json file in the CDNs/mirrors. My idea was to
> > provide that info for people who want to create simple/static
> > javascript web pages, external tools for monitoring the ports tree or
> > some type of automation system.
> >
> > Nobody wants to maintain a dynamic web page to show info about the
> > ports. We had one (i can't even remember the domain) and was
> > decommissioned long time ago. A static web page reading the json from
> > the CDN with javascript would be quite easy to maintain. Also, other
> > webs could use the info (like repology.org). Or tools like [1]. You
> > could compare packages between versions of OpenBSD without uncompress
> > various copies of sqlports.
> >
> > Everything would be generated by a simple python port during the bulk.
> > The script is now 20 lines and takes less than 10 seconds. The files
> > would be copied to pub/OpenBSD/{snapshots,version}/packages/arch.
> > Maybe only to amd64, because the info is the same. I abandoned the
> > idea one year ago because I wanted to convert the table Ports and the
> > json was 27MB[2]. Now I'm splitting a limited part of Ports in small
> > json files.
> >
> > Examples:
> >
> > 1,3MB, FullPkgPath, COMMENT, MAINTAINER
> > https://juanfra.keybase.pub/tmp-ports/packages-info.json
> >
> > 1,1MB, FullPkgPath, DISTNAME, FULLPKGNAME, PKGNAME
> > https://juanfra.keybase.pub/tmp-ports/packages-versions.json
> >
> > 1,7MB, FullPkgPath, BUILD_DEPENDS, LIB_DEPENDS, RUN_DEPENDS,
> > TEST_DEPENDS
> > https://juanfra.keybase.pub/tmp-ports/packages-depends.json
> >
> >
> > I've the python script but I've not started yet with the port. The
> > question for the porters (and in special people running the bulks,
> > signing/uploading packages) is, are you against this idea?.
> >
> >
> > 1.
> > https://github.com/danieljakots/obsd-ports-tools/blob/master/showvictims.py
> >
> > 2. https://juanfra.keybase.pub/tmp-ports/sqlports-everything.json
> >
> >
>

--
Juan Francisco Cantero Hurtado http://juanfra.info

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Hiltjo Posthuma
In reply to this post by Landry Breuil-5
On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:

> On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > I think it would be indeed nice to provide a way for people to easily
> > see the content of the ports tree without having to take the sqlite3 db
> > out from sqlports.tgz.
> >
> > Regarding the JSON, I think only one, with as much info as
> > possible with avoiding bloat, should be provided. And they should have
> > the field names ;)
>
> as for repology, in the meantime i've hacked my way to extract
> ports-INDEX from portslist package in
> http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> https://github.com/repology/repology/issues/149, because right now it's
> showing misleading info.
>
> Landry
>

I also use a similar method at the moment and wrote the shellscript below for
my own use:


#!/bin/sh
# fetch generated ports INDEX from databases/sqlports (portslist) from
# packages snapshots.

mkdir -p dl
cd dl || exit 1

pubkey="/etc/signify/openbsd-65-pkg.pub"
url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/"

indextxt="${url}index.txt"
ftp "${indextxt}"

portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt)

if test x"${portslistpkg}" = x""; then
        echo "cannot find portslist package in index.txt" >&2
        exit 1
fi

portslist="${url}${portslistpkg}"
ftp "${portslist}"

# verify
ftp "${url}SHA256.sig"
if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then
        exit 1
fi

tar -xzvf "${portslistpkg}" share/ports-INDEX
cp -v share/ports-INDEX ../INDEX

--
Kind regards,
Hiltjo

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Stuart Henderson
On 2019/07/02 21:48, Hiltjo Posthuma wrote:

> On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:
> > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > > I think it would be indeed nice to provide a way for people to easily
> > > see the content of the ports tree without having to take the sqlite3 db
> > > out from sqlports.tgz.
> > >
> > > Regarding the JSON, I think only one, with as much info as
> > > possible with avoiding bloat, should be provided. And they should have
> > > the field names ;)
> >
> > as for repology, in the meantime i've hacked my way to extract
> > ports-INDEX from portslist package in
> > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> > https://github.com/repology/repology/issues/149, because right now it's
> > showing misleading info.

"That's the best I can do in the current situation." -> well, removing
OpenBSD from the list would be better than that... Surely with the amount
of parsing they're doing they noticed the old INDEX file in cvs was junk
with duplicate entries etc anyway?

> I also use a similar method at the moment and wrote the shellscript below for
> my own use:
>
>
> #!/bin/sh
> # fetch generated ports INDEX from databases/sqlports (portslist) from
> # packages snapshots.
>
> mkdir -p dl
> cd dl || exit 1
>
> pubkey="/etc/signify/openbsd-65-pkg.pub"
> url="$(cat /etc/installurl)/snapshots/packages/$(uname -m)/"
>
> indextxt="${url}index.txt"
> ftp "${indextxt}"
>
> portslistpkg=$(grep -m 1 -o '\(portslist-[0-9p\.]*\.tgz\)$' < index.txt)
>
> if test x"${portslistpkg}" = x""; then
> echo "cannot find portslist package in index.txt" >&2
> exit 1
> fi
>
> portslist="${url}${portslistpkg}"
> ftp "${portslist}"
>
> # verify
> ftp "${url}SHA256.sig"
> if ! signify -C -p "${pubkey}" -x SHA256.sig "${portslistpkg}"; then
> exit 1
> fi
>
> tar -xzvf "${portslistpkg}" share/ports-INDEX
> cp -v share/ports-INDEX ../INDEX

I took the simpler approach

pkg_add portslist
ln -s /usr/local/share/ports-INDEX /usr/ports/INDEX

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Marc Espie-2
On Tue, Jul 02, 2019 at 09:44:51PM +0100, Stuart Henderson wrote:

> On 2019/07/02 21:48, Hiltjo Posthuma wrote:
> > On Tue, Jul 02, 2019 at 07:02:04PM +0200, Landry Breuil wrote:
> > > On Tue, Jul 02, 2019 at 12:21:50PM -0400, Daniel Jakots wrote:
> > > > I think it would be indeed nice to provide a way for people to easily
> > > > see the content of the ports tree without having to take the sqlite3 db
> > > > out from sqlports.tgz.
> > > >
> > > > Regarding the JSON, I think only one, with as much info as
> > > > possible with avoiding bloat, should be provided. And they should have
> > > > the field names ;)
> > >
> > > as for repology, in the meantime i've hacked my way to extract
> > > ports-INDEX from portslist package in
> > > http://ftp.fr.openbsd.org/pub/ports-INDEX to hopefully fix
> > > https://github.com/repology/repology/issues/149, because right now it's
> > > showing misleading info.
>
> "That's the best I can do in the current situation." -> well, removing
> OpenBSD from the list would be better than that... Surely with the amount
> of parsing they're doing they noticed the old INDEX file in cvs was junk
> with duplicate entries etc anyway?

Those guys just don't want to be bothered.
We don't matter enough for them to do some decent job out of it.

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Daniel Jakots-6
In reply to this post by Juan Francisco Cantero Hurtado
On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
<[hidden email]> wrote:

> I've the python script but I've not started yet with the port. The
> question for the porters (and in special people running the bulks,
> signing/uploading packages) is, are you against this idea?

It looks like to me no one has any objection :)

Reply | Threaded
Open this post in threaded view
|

Re: sqlports info available as a json in the CDN/mirrors, yay or nay?

Juan Francisco Cantero Hurtado
On Mon, Jul 15, 2019 at 09:00:46PM -0400, Daniel Jakots wrote:
> On Tue, 2 Jul 2019 17:42:07 +0200, Juan Francisco Cantero Hurtado
> <[hidden email]> wrote:
>
> > I've the python script but I've not started yet with the port. The
> > question for the porters (and in special people running the bulks,
> > signing/uploading packages) is, are you against this idea?
>
> It looks like to me no one has any objection :)
>

The script:
https://github.com/juanfra684/ports-json

The JSON:
https://juanfra.keybase.pub/tmp-ports/take3/packages.json

The size of the file is 14MB without compression and 2.7MB with gzip -1.
Some of our mirrors support gzip compression for some file types.

The file is mostly the same as in previous versions but I added a new
object at the start of the file to add some basic info about the build,
like the date, which I would like to use in the future. Also, the file
now includes the BROKEN info.

The columns used by the script are:

+-------------------------+-----------+
| PathId                  | DISCARDED |
+-------------------------+-----------+
| FullPkgPath             | SELECTED  |
+-------------------------+-----------+
| AUTOCONF_VERSION        | DISCARDED |
+-------------------------+-----------+
| AUTOMAKE_VERSION        | DISCARDED |
+-------------------------+-----------+
| BUILD_DEPENDS           | SELECTED  |
+-------------------------+-----------+
| CATEGORIES              | SELECTED  |
+-------------------------+-----------+
| COMES_WITH              | DISCARDED |
+-------------------------+-----------+
| COMMENT                 | SELECTED  |
+-------------------------+-----------+
| COMPILER                | DISCARDED |
+-------------------------+-----------+
| COMPILER_LANGS          | DISCARDED |
+-------------------------+-----------+
| COMPILER_LINKS          | DISCARDED |
+-------------------------+-----------+
| CONFIGURE_ARGS          | DISCARDED |
+-------------------------+-----------+
| CONFIGURE_STYLE         | DISCARDED |
+-------------------------+-----------+
| DESCR                   | DISCARDED |
+-------------------------+-----------+
| DESCR_CONTENTS          | SELECTED  |
+-------------------------+-----------+
| DISTFILES               | DISCARDED |
+-------------------------+-----------+
| DISTNAME                | SELECTED  |
+-------------------------+-----------+
| DIST_SUBDIR             | DISCARDED |
+-------------------------+-----------+
| DPB_PROPERTIES          | DISCARDED |
+-------------------------+-----------+
| EPOCH                   | DISCARDED |
+-------------------------+-----------+
| FIX_EXTRACT_PERMISSIONS | DISCARDED |
+-------------------------+-----------+
| FLAVORS                 | DISCARDED |
+-------------------------+-----------+
| FULLPKGNAME             | SELECTED  |
+-------------------------+-----------+
| GH_ACCOUNT              | DISCARDED |
+-------------------------+-----------+
| GH_COMMIT               | DISCARDED |
+-------------------------+-----------+
| GH_PROJECT              | DISCARDED |
+-------------------------+-----------+
| GH_TAGNAME              | DISCARDED |
+-------------------------+-----------+
| HOMEPAGE                | SELECTED  |
+-------------------------+-----------+
| IGNORE                  | DISCARDED |
+-------------------------+-----------+
| IS_INTERACTIVE          | DISCARDED |
+-------------------------+-----------+
| LIB_DEPENDS             | SELECTED  |
+-------------------------+-----------+
| MAINTAINER              | SELECTED  |
+-------------------------+-----------+
| MAKEFILE_LIST           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES            | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES0           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES1           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES2           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES3           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES4           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES5           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES6           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES7           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES8           | DISCARDED |
+-------------------------+-----------+
| MASTER_SITES9           | DISCARDED |
+-------------------------+-----------+
| MODULES                 | SELECTED  |
+-------------------------+-----------+
| NOT_FOR_ARCHS           | SELECTED  |
+-------------------------+-----------+
| NO_BUILD                | SELECTED  |
+-------------------------+-----------+
| NO_TEST                 | SELECTED  |
+-------------------------+-----------+
| ONLY_FOR_ARCHS          | SELECTED  |
+-------------------------+-----------+
| PATCHFILES              | DISCARDED |
+-------------------------+-----------+
| PERMIT_DISTFILES        | DISCARDED |
+-------------------------+-----------+
| PERMIT_DISTFILES_FTP    | DISCARDED |
+-------------------------+-----------+
| PERMIT_PACKAGE          | SELECTED  |
+-------------------------+-----------+
| PERMIT_PACKAGE_CDROM    | DISCARDED |
+-------------------------+-----------+
| PERMIT_PACKAGE_FTP      | DISCARDED |
+-------------------------+-----------+
| PKGNAME                 | SELECTED  |
+-------------------------+-----------+
| PKGPATHS                | DISCARDED |
+-------------------------+-----------+
| PKGSPEC                 | DISCARDED |
+-------------------------+-----------+
| PKGSTEM                 | DISCARDED |
+-------------------------+-----------+
| PKG_ARCH                | DISCARDED |
+-------------------------+-----------+
| PORTROACH               | DISCARDED |
+-------------------------+-----------+
| PORTROACH_COMMENT       | DISCARDED |
+-------------------------+-----------+
| PREFIX                  | DISCARDED |
+-------------------------+-----------+
| PSEUDO_FLAVOR           | DISCARDED |
+-------------------------+-----------+
| PSEUDO_FLAVORS          | DISCARDED |
+-------------------------+-----------+
| README                  | DISCARDED |
+-------------------------+-----------+
| README_CONTENTS         | DISCARDED |
+-------------------------+-----------+
| REVISION                | DISCARDED |
+-------------------------+-----------+
| RUN_DEPENDS             | SELECTED  |
+-------------------------+-----------+
| SEPARATE_BUILD          | DISCARDED |
+-------------------------+-----------+
| SHARED_LIBS             | SELECTED  |
+-------------------------+-----------+
| STATIC_PLIST            | DISCARDED |
+-------------------------+-----------+
| SUBPACKAGE              | DISCARDED |
+-------------------------+-----------+
| SUBST_VARS              | DISCARDED |
+-------------------------+-----------+
| SUPDISTFILES            | DISCARDED |
+-------------------------+-----------+
| TARGETS                 | DISCARDED |
+-------------------------+-----------+
| TEST_DEPENDS            | SELECTED  |
+-------------------------+-----------+
| TEST_IS_INTERACTIVE     | DISCARDED |
+-------------------------+-----------+
| UPDATE_PLIST_ARGS       | DISCARDED |
+-------------------------+-----------+
| USE_GMAKE               | DISCARDED |
+-------------------------+-----------+
| USE_GROFF               | DISCARDED |
+-------------------------+-----------+
| USE_LIBTOOL             | DISCARDED |
+-------------------------+-----------+
| USE_LLD                 | DISCARDED |
+-------------------------+-----------+
| USE_WXNEEDED            | SELECTED  |
+-------------------------+-----------+
| WANTLIB                 | SELECTED  |
+-------------------------+-----------+


I will try to send a port in the next weeks.


--
Juan Francisco Cantero Hurtado http://juanfra.info