Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / NVIDIA-Merlin/HugeCTR issues and pull requests

#362 - [BUG] HPS tensorflow plugin, multi-gpu example crashes

Issue - State: closed - Opened by Jeffery-Song about 2 years ago - 3 comments

#361 - [BUG] WDL training notebook for HugeCTR processing workflow fails with TypeError

Issue - State: closed - Opened by Spartee about 2 years ago - 5 comments
Labels: bug, P0

#360 - [BUG] Criteo example fails due to "Runtime error: file list open failed: ./criteo_data/file_list.txt"

Issue - State: closed - Opened by thuningxu about 2 years ago - 3 comments
Labels: bug, P0

#359 - Typo:dropoutlayer rate typo

Pull Request - State: closed - Opened by JacoCheung about 2 years ago - 1 comment

#358 - Maybe there is a typo in API docs?

Issue - State: closed - Opened by MARD1NO about 2 years ago - 2 comments

#357 - [Question] Why hybrid embedding is bound with async data reading?

Issue - State: closed - Opened by regnnighe about 2 years ago - 1 comment
Labels: question

#356 - [BUG] Segmentation fault using cudf after HugeCTR model run

Issue - State: closed - Opened by oliverholworthy over 2 years ago - 4 comments
Labels: bug, P1

#355 - Update triage github actions workflow

Pull Request - State: closed - Opened by benfred over 2 years ago - 1 comment

#354 - Build fixes for cudf 22.06

Pull Request - State: closed - Opened by benfred over 2 years ago - 1 comment

#353 - [BUG] HugeCTR doesn't compile with cudf v22.06+

Issue - State: closed - Opened by benfred over 2 years ago

#352 - Build docs with nightly container

Pull Request - State: closed - Opened by mikemckiernan over 2 years ago - 2 comments
Labels: fea::doc

#351 - [Requirement]Profiling operations for HugeCTR

Issue - State: closed - Opened by howiejayz over 2 years ago - 3 comments
Labels: fea::functional, P1, requirement

#350 - [Question] Low throughput with our testing data.

Issue - State: closed - Opened by AtekiRyu over 2 years ago - 4 comments
Labels: question, P1, TBD

#349 - Docs try hps

Pull Request - State: closed - Opened by mikemckiernan over 2 years ago - 1 comment

#348 - [BUG] The runtime GPU memory cost is high?

Issue - State: closed - Opened by AtekiRyu over 2 years ago - 5 comments

#347 - [Requirement] Movie-Lens example Enhancement

Issue - State: closed - Opened by jershi425 over 2 years ago - 1 comment
Labels: P1

#346 - [BUG] Installing of sparse_operation_kit from pip failed

Issue - State: closed - Opened by silpara over 2 years ago - 17 comments

#344 - [Question] Can HugeCTR run on a CPU-only machine?

Issue - State: closed - Opened by AtekiRyu over 2 years ago - 2 comments
Labels: question

#343 - Remove unnecessary deps from docs build

Pull Request - State: closed - Opened by mikemckiernan over 2 years ago - 1 comment

#342 - V3.8 changes

Pull Request - State: closed - Opened by minseokl over 2 years ago - 1 comment

#341 - [BUG]Trial bug to test the automation

Issue - State: closed - Opened by viswa-nvidia over 2 years ago

#340 - How can I enable GPU profiling for MLPerf dlrm for training result v2.0?

Issue - State: closed - Opened by regnnighe over 2 years ago - 1 comment
Labels: question

#339 - [Question] Complete guide to train wdl model using all 1TB criteo data in only six minutes

Issue - State: closed - Opened by LarryZhangy over 2 years ago - 2 comments
Labels: question

#338 - [Question] Failed to preprocess criteo data

Issue - State: closed - Opened by LarryZhangy over 2 years ago - 4 comments
Labels: question

#337 - [BUG] Failing to run on Windows / WSL 2

Issue - State: closed - Opened by oliverholworthy over 2 years ago - 1 comment

#336 - [BUG] notebooks of hugectr_wdl_prediction can not run

Issue - State: closed - Opened by LarryZhangy over 2 years ago - 4 comments

#334 - [BUG] HugeCTR Model segfaults on Tritonserver inference request

Issue - State: closed - Opened by jperez999 over 2 years ago - 7 comments

#332 - [Question] why can not use multi gpu of a example notebook

Issue - State: closed - Opened by LarryZhangy over 2 years ago - 9 comments
Labels: question

#330 - [Question] embedding table size of criteo dataset

Issue - State: closed - Opened by LarryZhangy over 2 years ago - 4 comments
Labels: question

#326 - Add samples in the Criteo Kaggle folder

Pull Request - State: closed - Opened by zehuanw over 2 years ago

#325 - v3.7 preview document update

Pull Request - State: closed - Opened by zehuanw over 2 years ago - 1 comment

#321 - [BUG] hps_demo.ipynb crashes with core dump

Issue - State: closed - Opened by PeterDykas over 2 years ago - 3 comments

#318 - [Requirement] Add python DLPack interface for HPS lookup

Issue - State: closed - Opened by nv-dlasalle over 2 years ago - 4 comments

#316 - [Requirement] Adding single GPU SOK test to CI/CD

Issue - State: closed - Opened by zehuanw over 2 years ago - 3 comments
Labels: critical, TBD

#311 - Error with docker build

Issue - State: closed - Opened by AtekiRyu over 2 years ago - 3 comments
Labels: fea::doc, fea::user experience, TBD

#305 - [BUG] Unable to run multi-node

Issue - State: closed - Opened by iidsample over 2 years ago - 9 comments
Labels: bug, P2, fea::user experience

#303 - Why HugeCTR doesn't support FTRL optimizer which is widely used at recommender system?

Issue - State: closed - Opened by AtekiRyu over 2 years ago - 4 comments
Labels: question, requirement

#298 - [BUG] Training fails if the number of parquet files is less than the number of GPUs

Issue - State: closed - Opened by leiterenato over 2 years ago - 7 comments
Labels: bug, P1, fea::user experience

#293 - question of single model multi-gpu deployment

Issue - State: closed - Opened by dulvqingyunLT almost 3 years ago - 9 comments
Labels: question

#289 - 【BUG】two different table use same param_interface

Pull Request - State: closed - Opened by marsmiao almost 3 years ago

#282 - How can I use NVTabular to generate Norm data?

Issue - State: closed - Opened by dulvqingyunLT almost 3 years ago - 3 comments
Labels: question

#277 - [Requirement] Inference_test self-contained

Issue - State: closed - Opened by albert17 almost 3 years ago - 4 comments
Labels: bug, P1, TBD

#261 - [BUG] SparseOperationKit hangs on initialization

Issue - State: closed - Opened by rllin about 3 years ago - 53 comments
Labels: bug, P0

#259 - [Question] How could we inspect the embedding files to sanity-check?

Issue - State: closed - Opened by shoyasaxa about 3 years ago - 4 comments
Labels: question, TBD

#254 - Update hugectr_user_guide.md

Pull Request - State: closed - Opened by lgardenhire about 3 years ago

#252 - Update README.md

Pull Request - State: closed - Opened by lgardenhire about 3 years ago

#239 - Update README.md

Pull Request - State: closed - Opened by lgardenhire over 3 years ago

#237 - Doc update document related to unified embedding

Pull Request - State: closed - Opened by KingsleyLiu-NV over 3 years ago

#227 - Fix wrong shape of d_row_ptrs_ for multi emb-tables in <inference_wrapper.hpp>

Pull Request - State: closed - Opened by xmh645214784 over 3 years ago - 2 comments

#220 - [Question] Custom Loss Functions and Storing embeddings in CPU

Issue - State: closed - Opened by goru001 over 3 years ago - 2 comments
Labels: question

#217 - [BUG] HugeCTR hard crashes machine upon job failure

Issue - State: closed - Opened by vinhngx over 3 years ago - 1 comment
Labels: bug, P0

#191 - A process in the process pool was terminated abruptly while the future was running or pending

Issue - State: closed - Opened by lizhen2017 almost 4 years ago - 3 comments
Labels: bug, critical

#188 - [Requirement] Supporting libsvm in input dataset

Issue - State: closed - Opened by zehuanw almost 4 years ago - 1 comment
Labels: requirement

#184 - [BUG] Random seed does not synchronize between nodes in Multi-Nodes Training.

Issue - State: closed - Opened by Kur0x about 4 years ago - 5 comments
Labels: bug, P1

#183 - [BUG]`row_offset` repeat last batch cause wrong result

Issue - State: closed - Opened by Kur0x about 4 years ago - 5 comments
Labels: bug, P0

#181 - [Question]Batch-level moving average statistics in Batch Normalization do not exchange between GPUs

Issue - State: closed - Opened by straywarrior about 4 years ago - 1 comment
Labels: bug, question, P1

#179 - [Requirement] Install version specified dependencies for CUDF in dev.Dockerfile

Issue - State: closed - Opened by XiaoleiShi-NV about 4 years ago - 2 comments
Labels: fea::functional, requirement

#175 - [Requirement] Enabling Samples / Tutorial / Notebooks in NGC container

Issue - State: closed - Opened by zehuanw about 4 years ago - 1 comment
Labels: fea::refactor, P1, fea::user experience

#171 - TensorRT support in Inference

Issue - State: closed - Opened by zehuanw about 4 years ago - 2 comments
Labels: fea::functional, requirement

#103 - Pytorch plugin

Issue - State: closed - Opened by shijieliu about 4 years ago
Labels: fea::functional, requirement, fea::user experience

#55 - Should hugectr add batch normalization offset and scale

Issue - State: closed - Opened by BookChan about 4 years ago
Labels: fea::functional, requirement

#45 - Fix compute capability values

Pull Request - State: closed - Opened by miguelusque about 4 years ago

#44 - Removed question #28. Duplicated with question #8.

Pull Request - State: closed - Opened by miguelusque about 4 years ago

#43 - Update compatibility compute to 6.0

Pull Request - State: closed - Opened by miguelusque about 4 years ago

#42 - [FEA] Make optional the number of files in Norm Dataset File List

Issue - State: closed - Opened by miguelusque about 4 years ago - 1 comment
Labels: requirement, fea::user experience

#24 - Delete the duplicated questions

Pull Request - State: closed - Opened by twoflypig over 4 years ago

#4 - Fix CMakeList to include targets for CUDNN, NCCL

Pull Request - State: closed - Opened by dmudiger almost 5 years ago - 2 comments