Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / HTTPArchive/data-pipeline issues and pull requests
#179 - Add check for valid GZIPs
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 7 comments
#178 - Specify workflows service account in deploy GHA
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#177 - Add report generation to workflow (move off of worker VM)
Issue -
State: closed - Opened by giancarloaf about 2 years ago
- 2 comments
#176 - Bump black from 22.12.0 to 23.1.0
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 2 comments
Labels: dependencies, python
#175 - Adding Firefox use counter data to HTTPArchive
Issue -
State: open - Opened by zcorpan about 2 years ago
- 8 comments
#174 - Add Google Container Registry to dependabot
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#173 - Bump dataflow flex template build tag
Pull Request -
State: closed - Opened by github-actions[bot] about 2 years ago
- 1 comment
#172 - Fix dataflow flex template version bumping github action
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#171 - Fix Google Cloud deployment GitHub Actions
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#170 - Bump github/super-linter from 4.10.0 to 4.10.1
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 1 comment
Labels: dependencies, github_actions
#169 - Bump apache-beam[gcp] from 2.43.0 to 2.44.0
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 2 comments
Labels: dependencies, python
#168 - Add GitHub Action to build flex template and deploy workflow on merge to main branch
Issue -
State: closed - Opened by giancarloaf about 2 years ago
#167 - Bump actions/upload-artifact from 3.1.1 to 3.1.2
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 1 comment
Labels: dependencies, github_actions
#166 - Bump github/super-linter from 4.9.7 to 4.10.0
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 1 comment
Labels: dependencies, github_actions
#162 - Add Dataflow flex templates and GCP Workflows
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 3 comments
#161 - Update har manifest instructions
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#160 - Investigate why the Dataflow jobs stall on the remaining HARs
Issue -
State: closed - Opened by rviscomi about 2 years ago
- 1 comment
Labels: bug
#159 - Add example flex template commands to the README
Pull Request -
State: closed - Opened by rviscomi about 2 years ago
- 1 comment
#158 - Bump black from 22.10.0 to 22.12.0
Pull Request -
State: closed - Opened by dependabot[bot] about 2 years ago
- 1 comment
Labels: dependencies, python
#157 - Add HAR manifest generation steps to README
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 1 comment
#156 - Fix home page filtering for summary requests
Pull Request -
State: closed - Opened by giancarloaf about 2 years ago
- 2 comments
#155 - Run November BigQuery pipeline
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 9 comments
#154 - Bump apache-beam[gcp] from 2.41.0 to 2.43.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 2 comments
Labels: dependencies, python
#153 - Too many pages in `summary_requests`
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
Labels: bug
#152 - Combined data pipeline failed to process October dataset
Issue -
State: closed - Opened by rviscomi over 2 years ago
Labels: bug
#151 - Bump actions/upload-artifact from 3.1.0 to 3.1.1
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 1 comment
Labels: dependencies, github_actions
#150 - Create auto updating `sample_data` queries
Issue -
State: closed - Opened by tunetheweb over 2 years ago
- 5 comments
#149 - The new schema and cost concerns for users
Issue -
State: closed - Opened by tunetheweb over 2 years ago
- 6 comments
#148 - Blink feature tables haven't been updated
Issue -
State: closed - Opened by tunetheweb over 2 years ago
- 1 comment
Labels: bug
#147 - Bump apache-beam[gcp] from 2.41.0 to 2.42.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 2 comments
Labels: dependencies, python
#146 - Bump ewjoachim/python-coverage-comment-action from 2.1.0 to 3.0.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 2 comments
Labels: dependencies, github_actions
#145 - Bump black from 22.8.0 to 22.10.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 1 comment
Labels: dependencies, python
#144 - Bump github/super-linter from 4.9.6 to 4.9.7
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 1 comment
Labels: dependencies, github_actions
#143 - Explore integrating with Cloudflare's Domain Intelligence API
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 4 comments
#142 - Investigate views on `all` dataset
Issue -
State: closed - Opened by tunetheweb over 2 years ago
- 8 comments
#141 - Convert the `latest` dataset to views
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 16 comments
Labels: enhancement
#140 - Improve test coverage of summary pipeline
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#139 - Account for redirects when selecting secondary page URL
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 2 comments
Labels: bug
#138 - Create a process to backfill `all` tables from HARs older than March 2022
Issue -
State: closed - Opened by giancarloaf over 2 years ago
#137 - Bump black from 22.6.0 to 22.8.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 1 comment
Labels: dependencies, python
#136 - Bug/backfill fixes
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#135 - Bump apache-beam[gcp] from 2.40.0 to 2.41.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
- 2 comments
Labels: dependencies, python
#134 - Improve test coverage
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#133 - August "combined" tables contain duplicates
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 5 comments
Labels: bug
#132 - Add python code coverage badge to README.md
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#131 - Add python code coverage
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#130 - Add unit tests for beam functionality
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
#129 - Subscription for "releases" at the end of a pipeline run
Issue -
State: open - Opened by giancarloaf over 2 years ago
- 2 comments
#128 - Pub/Sub Bottleneck
Issue -
State: closed - Opened by pmeenan over 2 years ago
- 1 comment
#127 - Assess secondary page quality
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 2 comments
Labels: question
#126 - Fix `doctype` - stringify output
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
Labels: bug
#125 - Bump github/super-linter from 4.9.5 to 4.9.6
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
Labels: dependencies, github_actions
#124 - Summary statistics for BigQuery `all` tables
Issue -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
#123 - Fix parsed_css root page field
Pull Request -
State: closed - Opened by rviscomi over 2 years ago
#122 - Bump github/super-linter from 4.9.4 to 4.9.5
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
Labels: dependencies, github_actions
#121 - Deprecate streaming pipeline
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
#120 - Set up mechanism for triggering the batch pipeline
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
Labels: enhancement
#119 - Switch data pipeline defaults from streaming to batch
Issue -
State: closed - Opened by rviscomi over 2 years ago
Labels: enhancement
#118 - Standard library of custom BigQuery functions
Issue -
State: open - Opened by rviscomi over 2 years ago
- 1 comment
Labels: enhancement
#117 - Clean up intermediary load job tables on BQ
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 2 comments
Labels: bug
#116 - Secondary pages marked as root pages in parsed CSS table
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
Labels: bug
#115 - Pipe parsed_css custom metric data into BQ
Pull Request -
State: closed - Opened by rviscomi over 2 years ago
#114 - Improved documentation for the reports
Issue -
State: closed - Opened by Themanwithoutaplan over 2 years ago
- 2 comments
#113 - Crawlid to be continued after EOL of batching
Issue -
State: closed - Opened by Themanwithoutaplan over 2 years ago
- 5 comments
#112 - Page summary reports contain duplicates
Issue -
State: closed - Opened by Themanwithoutaplan over 2 years ago
- 4 comments
Labels: bug
#111 - Combined pipeline fixes from July 2022 crawl
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
#110 - GoogleAPICallError: None POST https://bigquery.googleapis.com/bigquery/v2/[...]/insertAll Error 413 (Request Entity Too Large)!!1
Issue -
State: closed - Opened by giancarloaf over 2 years ago
- 1 comment
Labels: bug
#109 - `All` pipeline improvements
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 2 comments
#108 - Reformat to avoid linter errors
Pull Request -
State: closed - Opened by tunetheweb over 2 years ago
#107 - Bump black from 22.3.0 to 22.6.0
Pull Request -
State: closed - Opened by dependabot[bot] over 2 years ago
Labels: dependencies, python
#106 - Add pip to dependabot.yml
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
#105 - Update non-summary partitioning
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 4 comments
#104 - Bump beam SDK to v2.40
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
#103 - Exceeded rate limits: too many api requests per user per method for this user_method
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 4 comments
Labels: bug
#102 - Error: The Dataflow job may be impacted by insufficient Pub/Sub quota
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 3 comments
Labels: bug
#101 - Ensure JSON pop has a default
Pull Request -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
#98 - Prepare for the July 2022 crawl
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
#97 - Python CSS parser in combined pipeline
Pull Request -
State: closed - Opened by rviscomi over 2 years ago
#96 - Python CSS parser
Pull Request -
State: closed - Opened by rviscomi over 2 years ago
- 1 comment
#94 - Investigate why the crawl stalled on the remaining 300k HARs in June 2022
Issue -
State: closed - Opened by rviscomi over 2 years ago
- 2 comments
Labels: bug
#93 - Keep secondary page data separate from home page data (for now)
Issue -
State: closed - Opened by rviscomi over 2 years ago
#92 - Eliminate the need to run batch pipelines
Issue -
State: closed - Opened by rviscomi over 2 years ago
Labels: enhancement
#91 - Combine summary and non-summary pipelines
Pull Request -
State: closed - Opened by giancarloaf over 2 years ago
- 5 comments
Labels: enhancement
#71 - Filter out hash urls
Issue -
State: open - Opened by giancarloaf over 2 years ago
- 1 comment
Labels: enhancement
#66 - Add new image formats and change typ to type
Pull Request -
State: closed - Opened by tunetheweb over 2 years ago
- 1 comment
#46 - Interacting on websites
Issue -
State: open - Opened by nrllh almost 3 years ago
- 15 comments
#43 - Discrepancies between experimental and legacy summary_pages pipelines
Issue -
State: closed - Opened by rviscomi almost 3 years ago
- 13 comments
Labels: bug
#38 - Combine Dataflow pipelines
Issue -
State: closed - Opened by rviscomi almost 3 years ago
- 1 comment
#31 - Keep technology detections up to date
Issue -
State: open - Opened by rviscomi about 3 years ago
- 2 comments
#30 - Keep feature counters up to date
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 1 comment
#27 - Optimize GCP costs
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 2 comments
#26 - Backfill historical data through new analysis pipeline
Issue -
State: closed - Opened by rviscomi about 3 years ago
#24 - Monitor for missing reports
Issue -
State: closed - Opened by tunetheweb about 3 years ago
- 4 comments
#21 - Document how the new analysis pipeline works, including metrics
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 1 comment
#19 - Build a new monthly analysis pipeline based on an evergreen version of Web Almanac queries
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 2 comments
#18 - Pre-parse CSS in Dataflow before writing to BigQuery
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 1 comment
#15 - Reorganize the BigQuery datasets to be more efficient
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 15 comments
#8 - Achieve 100% test coverage for all new pipeline code
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 2 comments
#6 - Document how the new GCP pipeline works
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 2 comments
#3 - Add the ability to monitor each stage of the GCP pipeline
Issue -
State: closed - Opened by rviscomi about 3 years ago
- 8 comments