Ecosyste.ms: Issues

An open API service providing issue and pull request metadata for open source projects.

GitHub / chfoo/warcat issues and pull requests

#25 - Add 'warcat' console_scripts entry point; also ignore *.egg-info

Pull Request - State: open - Opened by dlitz over 2 years ago

#24 - http.client.BadStatusLine: http/1.1 200 OK

Issue - State: open - Opened by chris-aeviator over 3 years ago

#23 - Add force_gzip flag to WARC.load to fix #20

Pull Request - State: closed - Opened by acrois over 3 years ago

#22 - No mention of 'resource' in list at verify_refers_to

Issue - State: open - Opened by RvanVeenendaal almost 5 years ago - 2 comments
Labels: bug

#21 - [Merged OK] Add target uri filter

Pull Request - State: closed - Opened by JesseWeinstein about 5 years ago - 1 comment

#20 - pass on warc.gz error

Issue - State: closed - Opened by marked over 5 years ago - 1 comment
Labels: bug

#19 - Malformed HTTP headers lead to "ValueError: need more than 1 value to unpack" crash

Issue - State: open - Opened by JustAnotherArchivist almost 6 years ago - 1 comment
Labels: bug

#17 - Use errors='replace' when decoding HTTP headers

Pull Request - State: closed - Opened by Frogging101 almost 8 years ago - 1 comment

#16 - Handling for "files" that are purely in memory?

Issue - State: open - Opened by spott about 8 years ago - 2 comments
Labels: bug

#15 - Support payload digest of revisit records

Issue - State: open - Opened by Arkiver2 about 8 years ago - 1 comment
Labels: enhancement

#14 - Add easy way to iterate over warc records

Issue - State: open - Opened by sirex over 8 years ago - 3 comments
Labels: enhancement

#13 - URL agnostic deduplication of WARC

Issue - State: open - Opened by Arkiver2 over 8 years ago
Labels: enhancement

#12 - 'utf-8' codec can't decode byte invalid continuation byte

Issue - State: closed - Opened by fanchyna almost 9 years ago - 1 comment
Labels: bug

#11 - A name to a file object is not handled correctly

Issue - State: open - Opened by chfoo almost 9 years ago
Labels: bug

#10 - Reading in an in-memory gzip.GzipFile object breaks warcat.model.binary.BinaryFileRef objects

Issue - State: closed - Opened by d-m almost 9 years ago - 3 comments
Labels: bug

#9 - Extract performance is extremely slow on megawarcs

Issue - State: open - Opened by gwern about 9 years ago - 1 comment
Labels: help wanted

#8 - Feature: extract only files matching a regexp

Issue - State: open - Opened by gwern about 9 years ago
Labels: enhancement

#7 - Feature: extract WARCs specified with index/length

Issue - State: open - Opened by gwern about 9 years ago - 1 comment
Labels: enhancement

#6 - http.client.IncompleteRead crash during extract

Issue - State: closed - Opened by chfoo over 10 years ago - 1 comment
Labels: bug

#5 - Handle long filenames

Issue - State: closed - Opened by chfoo over 10 years ago - 1 comment
Labels: bug

#4 - Support warnings when Content-Type doesn't match what cdx-writer expects

Issue - State: closed - Opened by chfoo about 11 years ago - 1 comment
Labels: enhancement, invalid

#3 - Support warnings when WARC field name casing don't match hanzo's warc-tools.

Issue - State: open - Opened by chfoo about 11 years ago - 1 comment
Labels: enhancement

#2 - Support older Python 2.7

Issue - State: open - Opened by chfoo about 11 years ago - 2 comments
Labels: enhancement

#1 - Fields with empty values in metadata records increases block length

Issue - State: closed - Opened by chfoo about 11 years ago