Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / internetarchive/Zeno issues and pull requests

#132 - Panic on /workers access

Issue - State: open - Opened by CorentinB about 1 month ago
Labels: bug, P1

#131 - Verify & test our XML extraction in the context of sitemaps

Issue - State: open - Opened by CorentinB about 1 month ago
Labels: enhancement, P3

#130 - fix: ensure there is no infinite recursion of URLs

Pull Request - State: closed - Opened by NGTmeaty about 1 month ago

#129 - Add additional tests and validate current URL behavior

Pull Request - State: closed - Opened by NGTmeaty about 1 month ago
Labels: enhancement

#128 - Fix the handover bypass when seeds are loaded from list and handover is disabled

Pull Request - State: closed - Opened by equals215 about 1 month ago
Labels: bug, P1

#127 - Add Pyroscope profiling support

Pull Request - State: closed - Opened by NGTmeaty about 1 month ago
Labels: enhancement

#126 - Add proper YouTube archiving via YT-DLP

Pull Request - State: open - Opened by CorentinB about 1 month ago
Labels: enhancement, internal-only, P3

#125 - Correct seedList enqueuing process

Pull Request - State: closed - Opened by equals215 about 1 month ago
Labels: enhancement, P2

#124 - Ingest seeds before starting workers

Pull Request - State: closed - Opened by CorentinB about 1 month ago
Labels: bug

#123 - Send on closed channel panic

Issue - State: open - Opened by CorentinB about 2 months ago - 2 comments
Labels: bug, P1

#122 - Split Zeno in smaller packages with a better structure

Pull Request - State: open - Opened by equals215 about 2 months ago - 3 comments
Labels: enhancement, internal-only, P1

#121 - Queue all items from seeds list before starting to crawl

Issue - State: closed - Opened by CorentinB about 2 months ago
Labels: enhancement, P3

#120 - Calming down queue stats

Pull Request - State: closed - Opened by yzqzss about 2 months ago

#119 - Change default queue behaviour

Pull Request - State: closed - Opened by equals215 about 2 months ago
Labels: enhancement, internal-only, P1

#118 - WAL tests fail

Issue - State: closed - Opened by CorentinB about 2 months ago - 3 comments
Labels: bug, P1

#117 - Enable `log` package to distribute a stored logger to all other packages

Pull Request - State: closed - Opened by equals215 about 2 months ago - 3 comments
Labels: bug, enhancement, P1

#116 - Lock-free AwaitWALCommitted (and smoother queue?)

Pull Request - State: closed - Opened by yzqzss about 2 months ago

#115 - Define Zeno's queuing behavior properly

Issue - State: closed - Opened by CorentinB about 2 months ago - 2 comments
Labels: enhancement, internal-only, P1

#114 - Fix idna.ToASCII fail on punycode encoded URLs with port

Pull Request - State: closed - Opened by CorentinB about 2 months ago
Labels: bug

#113 - Transform Zeno architecture to a crawling pipeline effectively making use of Go channels

Issue - State: open - Opened by equals215 about 2 months ago
Labels: enhancement, internal-only, P2

#112 - Break down Zeno in smaller packages, especially `crawl` package which has grown too big

Issue - State: open - Opened by equals215 about 2 months ago
Labels: enhancement, internal-only, P1

#111 - Reuse free space from popped items

Pull Request - State: open - Opened by equals215 about 2 months ago - 3 comments
Labels: enhancement, internal-only

#110 - fix: hanging on indexManager.Close()

Pull Request - State: closed - Opened by yzqzss about 2 months ago - 2 comments

#109 - Improve WAL concurrency performance by @yzqzss and make it optional

Pull Request - State: closed - Opened by equals215 about 2 months ago - 13 comments

#108 - SIGSEGV logging in BatchEnqueue

Issue - State: closed - Opened by CorentinB about 2 months ago - 3 comments
Labels: bug, P1

#107 - Using group commit to improve WAL concurrency performance

Pull Request - State: closed - Opened by yzqzss about 2 months ago - 1 comment
Labels: enhancement

#106 - Queue handover v2

Pull Request - State: closed - Opened by equals215 about 2 months ago
Labels: enhancement, internal-only

#105 - Remove `runtime.Gosched()` in polling

Pull Request - State: closed - Opened by yzqzss about 2 months ago - 1 comment
Labels: enhancement

#104 - Optimize `get list` loading performance

Pull Request - State: closed - Opened by yzqzss about 2 months ago
Labels: enhancement

#103 - --exclude-host not found

Issue - State: closed - Opened by CorentinB about 2 months ago
Labels: bug

#102 - Fix readItemsFromQueue() CPU 100%, fix various data races, fix hang when all items are deduplicated

Pull Request - State: closed - Opened by yzqzss about 2 months ago - 5 comments
Labels: bug

#101 - Have `queue.Enqueue()` handover items to idle workers and optimize workers routines

Pull Request - State: closed - Opened by equals215 about 2 months ago - 5 comments
Labels: enhancement, internal-only

#99 - Implement host rotation and Enqueue/Dequeue access regulation via atomic booleans

Pull Request - State: closed - Opened by equals215 about 2 months ago - 2 comments
Labels: enhancement, internal-only

#98 - Add dequeue enqueue stats

Pull Request - State: open - Opened by CorentinB about 2 months ago - 2 comments

#97 - create url_string_test.go

Pull Request - State: open - Opened by willmhowes about 2 months ago - 2 comments
Labels: enhancement

#96 - Restore HQ flags

Issue - State: closed - Opened by CorentinB about 2 months ago
Labels: bug

#95 - Implement linkheader parsing

Pull Request - State: closed - Opened by HarshNarayanJha about 2 months ago - 5 comments
Labels: enhancement

#94 - Queue and Index should reuse free space

Issue - State: open - Opened by equals215 about 2 months ago
Labels: enhancement, internal-only, P2

#93 - Persist & load queue stats

Pull Request - State: closed - Opened by CorentinB about 2 months ago
Labels: enhancement, internal-only

#92 - Add logging capabilities for queue (index too) using custom `log` package

Issue - State: closed - Opened by equals215 about 2 months ago - 1 comment
Labels: enhancement, internal-only, P1

#91 - Fix commit hash in User Agent can't be calculated when not present

Pull Request - State: closed - Opened by CorentinB about 2 months ago
Labels: bug

#90 - Instantiate a `CODE_OF_CONDUCT.md` as the repo drags some traction

Issue - State: open - Opened by equals215 about 2 months ago
Labels: documentation, internal-only, P3

#89 - Panic when starting Zeno with go run

Issue - State: closed - Opened by CorentinB about 2 months ago
Labels: bug

#88 - Change log location

Pull Request - State: closed - Opened by nick2432 2 months ago - 7 comments
Labels: bug

#87 - Extract URLs from ebook formats (EPUB, MOBI..)

Issue - State: open - Opened by CorentinB 2 months ago
Labels: enhancement, P4

#86 - Extract URLs from images

Issue - State: open - Opened by CorentinB 2 months ago - 2 comments
Labels: enhancement, P4

#85 - Replace github.com/tomnomnom/linkheader with stdlib

Issue - State: closed - Opened by CorentinB 2 months ago - 10 comments
Labels: enhancement, good first issue

#84 - Replace github.com/clbanning/mxj/v2 with stdlib

Issue - State: open - Opened by CorentinB 2 months ago
Labels: enhancement

#83 - Revamp index mechanism with a WAL

Pull Request - State: closed - Opened by equals215 2 months ago - 4 comments
Labels: enhancement, internal-only

#82 - pprof API expose silently fail when port is used

Issue - State: open - Opened by CorentinB 2 months ago
Labels: bug, good first issue

#81 - Replicate the active workers count for `--live-stats` on the new version of workers

Issue - State: closed - Opened by equals215 2 months ago - 1 comment
Labels: bug, enhancement, internal-only

#80 - Fix workers not stopping properly and temp fix for workers hanging unexpectedly

Pull Request - State: closed - Opened by equals215 2 months ago - 1 comment
Labels: bug

#79 - Fix crash when --api is not set

Pull Request - State: closed - Opened by NGTmeaty 2 months ago - 1 comment
Labels: bug

#78 - Rewriting the queue

Pull Request - State: closed - Opened by CorentinB 2 months ago - 11 comments
Labels: enhancement, internal-only

#77 - Fix live-stats

Pull Request - State: closed - Opened by CorentinB 2 months ago

#76 - Add deduping stats

Pull Request - State: closed - Opened by CorentinB 2 months ago
Labels: enhancement

#75 - Remove Gin dependency

Pull Request - State: closed - Opened by CorentinB 2 months ago - 1 comment
Labels: enhancement

#74 - Add basic UI to manage Zeno

Issue - State: open - Opened by CorentinB 2 months ago
Labels: enhancement

#73 - Logs are written to the wrong location

Issue - State: closed - Opened by willmhowes 2 months ago - 4 comments
Labels: bug

#72 - live-stats flag is broken

Issue - State: closed - Opened by willmhowes 2 months ago - 1 comment
Labels: bug

#71 - fix: logfile declaration in log.go:New()

Pull Request - State: closed - Opened by equals215 3 months ago - 2 comments

#70 - Log rotation cause panic / SIGSEGV

Issue - State: closed - Opened by CorentinB 3 months ago - 2 comments
Labels: bug

#69 - Disk space pause is not working

Issue - State: open - Opened by NGTmeaty 3 months ago - 3 comments
Labels: bug

#68 - Add site-specific code for Facebook

Pull Request - State: closed - Opened by CorentinB 3 months ago - 1 comment

#67 - Rewrite of cmd+config packages using `spf13` `cobra` and `viper`

Pull Request - State: closed - Opened by equals215 3 months ago - 2 comments
Labels: enhancement

#66 - Give Zeno the logging it deserves

Pull Request - State: closed - Opened by equals215 3 months ago - 3 comments
Labels: enhancement

#65 - Add state management for workers

Pull Request - State: closed - Opened by equals215 3 months ago
Labels: enhancement

#64 - Add site-specific support for Streamable

Pull Request - State: closed - Opened by CorentinB 3 months ago

#63 - Add state management for workers

Pull Request - State: closed - Opened by equals215 3 months ago - 1 comment
Labels: enhancement

#62 - Allow crawl space threshold to be set on CLI, report space avail

Pull Request - State: closed - Opened by machawk1 4 months ago - 2 comments

#61 - Allow free space threshold to be customizable

Issue - State: closed - Opened by machawk1 4 months ago

#60 - Enhance JSON extraction & add XML extraction

Pull Request - State: closed - Opened by CorentinB 6 months ago

#59 - Add support to send a cookie with CDX requests

Pull Request - State: closed - Opened by NGTmeaty 8 months ago
Labels: enhancement

#58 - Small fixes to fix memory usage

Pull Request - State: closed - Opened by CorentinB 8 months ago

#57 - Properly save sync.map version of Frontier

Pull Request - State: closed - Opened by NGTmeaty 11 months ago
Labels: bug

#56 - Add: Telegram support

Pull Request - State: closed - Opened by CorentinB 12 months ago

#55 - Add headless / headfull capabilities to Zeno

Pull Request - State: open - Opened by CorentinB about 1 year ago
Labels: enhancement

#54 - Fix improperly escaped URLs

Pull Request - State: closed - Opened by NGTmeaty about 1 year ago
Labels: enhancement

#53 - Fix potential blocking

Pull Request - State: closed - Opened by CorentinB about 1 year ago

#52 - Fix ElasticSearch

Pull Request - State: closed - Opened by NGTmeaty about 1 year ago
Labels: bug

#51 - Limit total current requests per domain

Pull Request - State: closed - Opened by NGTmeaty about 1 year ago

#50 - Add --exclude-string & daily ES indexes

Pull Request - State: closed - Opened by CorentinB over 1 year ago

#49 - Improve logging

Pull Request - State: closed - Opened by NGTmeaty over 1 year ago

#48 - Add ElasticSearch logging

Pull Request - State: closed - Opened by CorentinB over 1 year ago

#47 - WARC update

Pull Request - State: closed - Opened by NGTmeaty over 1 year ago

#46 - Various bug fixes

Pull Request - State: closed - Opened by NGTmeaty over 1 year ago

#45 - User-Agent improvements

Pull Request - State: closed - Opened by NGTmeaty over 1 year ago

#44 - Add cloudflarestream.com support as a plugin

Pull Request - State: closed - Opened by CorentinB over 1 year ago

#43 - AWS and mismatch for User-Agent Zeno

Issue - State: closed - Opened by tvb almost 2 years ago - 2 comments

#42 - No such file or directory panic

Issue - State: closed - Opened by CorentinB almost 2 years ago

#41 - Make WARC temp dir configurable via --warc-temp-dir

Pull Request - State: closed - Opened by CorentinB about 2 years ago

#40 - Update WARC and asset extraction improvements

Pull Request - State: closed - Opened by NGTmeaty about 2 years ago
Labels: enhancement

#39 - Code cleanup and asset exclusion based on previous settings

Pull Request - State: closed - Opened by NGTmeaty about 2 years ago
Labels: enhancement

#38 - Add: seencheck when executing redirects

Pull Request - State: closed - Opened by CorentinB about 2 years ago

#37 - feat: update warc to 0.8.21

Pull Request - State: closed - Opened by NGTmeaty about 2 years ago

#36 - Various fixes

Pull Request - State: closed - Opened by NGTmeaty about 2 years ago - 1 comment
Labels: enhancement

#35 - Investigate mailto: links

Issue - State: closed - Opened by NGTmeaty about 2 years ago - 1 comment

#33 - Add: crawl rate limiter

Pull Request - State: closed - Opened by CorentinB about 2 years ago

#32 - Add setting for warc pool size and update to warc 0.8.20

Pull Request - State: closed - Opened by NGTmeaty about 2 years ago
Labels: enhancement