Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / modelscope/data-juicer issues and pull requests
#479 - support api retry
Pull Request -
State: open - Opened by drcege 1 day ago
#478 - motion_score_raft
Pull Request -
State: open - Opened by drcege 3 days ago
Labels: enhancement, dj:op
#477 - Windows system support
Issue -
State: open - Opened by zytcharming 4 days ago
Labels: question
#476 - Update of Jupyter Notebooks
Issue -
State: open - Opened by HYLcool 4 days ago
Labels: bug, documentation
#474 - [Bug]: perplexity_filter operator runs out of memory (OOM)
Issue -
State: open - Opened by weiaicunzai 5 days ago
Labels: bug
#473 - How to calculate the image_text_similarity scores for both Chinese and English?
Issue -
State: open - Opened by weiaicunzai 5 days ago
Labels: question
#472 - Optimize OP docs for better usage
Pull Request -
State: closed - Opened by HYLcool 5 days ago
Labels: documentation, enhancement
#471 - minor fix for sandbox
Pull Request -
State: open - Opened by HYLcool 6 days ago
Labels: bug
#470 - A try_num parameter is needed when generating data with an LLM
Issue -
State: open - Opened by BeachWang 6 days ago
Labels: enhancement
#469 - KeyError:"text_key" raised when running process
Issue -
State: closed - Opened by huangkaipeng4399 9 days ago
Labels: question
#468 - [Ready] Basic Service Implementation
Pull Request -
State: open - Opened by drcege 9 days ago
Labels: enhancement
#467 - How to obtain the weights of the 3 models [chinese,code,gtp3] in the tool_quality_classifier module?
Issue -
State: open - Opened by yaun248 11 days ago
- 1 comment
Labels: question
#466 - Is there a feature for viewing the data filtered out by operators?
Issue -
State: closed - Opened by huangkaipeng4399 11 days ago
- 2 comments
Labels: question
#465 - Auto publishment of docker images and pypi packages
Pull Request -
State: closed - Opened by HYLcool 16 days ago
Labels: enhancement
#464 - Probe-based OP Fusion & Reordering
Pull Request -
State: open - Opened by HYLcool 16 days ago
Labels: enhancement
#463 - [Ready] Add API Call & Example OPs
Pull Request -
State: closed - Opened by drcege 17 days ago
Labels: enhancement
#462 - minor fix
Pull Request -
State: closed - Opened by drcege 17 days ago
Labels: documentation
#461 - Provide a dynamic table supporting search and filtering for the paper list
Pull Request -
State: closed - Opened by zhenqincn 18 days ago
#460 - unittest opt
Pull Request -
State: closed - Opened by HYLcool 18 days ago
Labels: enhancement, priority:high
#459 - use chat_template
Pull Request -
State: closed - Opened by drcege 19 days ago
- 1 comment
Labels: bug, enhancement
#458 - How to use 'chinese_convert_mapper' ?
Issue -
State: open - Opened by abchbx 19 days ago
- 4 comments
Labels: question
#457 - How to use ‘hf_model’
Issue -
State: open - Opened by abchbx 19 days ago
- 3 comments
Labels: question
#456 - refactor paper list
Pull Request -
State: closed - Opened by zhenqincn 19 days ago
#455 - + Add docs for newly-added OP image_face_count_filter
Pull Request -
State: closed - Opened by HYLcool 20 days ago
#454 - [Ready] align sft formats & new ops
Pull Request -
State: closed - Opened by drcege 20 days ago
- 1 comment
Labels: enhancement
#453 - [Bug]: librosa use lazy_loader which depend on python version
Issue -
State: open - Opened by BeachWang 24 days ago
- 1 comment
Labels: bug
#452 - #446 Add image_face_count_filter and related tests
Pull Request -
State: closed - Opened by TobyJasper 25 days ago
#451 - [Feat]: Unified LLM Calling Management
Issue -
State: open - Opened by drcege 25 days ago
Labels: enhancement
#450 - [Feat]: Automatic Version Matching During Installation
Issue -
State: open - Opened by drcege 25 days ago
Labels: enhancement
#449 - [Feat]: Enhance Unit Test Coverage for Python and CUDA Compatibility
Issue -
State: open - Opened by drcege 25 days ago
Labels: enhancement
#448 - Optimization for batched processing
Pull Request -
State: closed - Opened by HYLcool 26 days ago
- 1 comment
Labels: enhancement, dj:op
#447 - Fix BiMix paper link
Pull Request -
State: closed - Opened by drcege 27 days ago
#446 - Feedback on image_face_ratio_filter and Suggestion for a New image_face_counter_filter Operator
Issue -
State: open - Opened by TobyJasper 29 days ago
- 2 comments
Labels: enhancement
#445 - update readme, add pai product link
Pull Request -
State: closed - Opened by Cathy0908 29 days ago
#444 - sandbox doc update
Pull Request -
State: closed - Opened by BeachWang 30 days ago
#443 - fix lazy_loader
Pull Request -
State: closed - Opened by BeachWang about 1 month ago
- 4 comments
Labels: bug
#442 - [WIP] Optimize ray mode performance
Pull Request -
State: open - Opened by pan-x-c about 1 month ago
#441 - [Bug]: test_adapter compatibility
Issue -
State: open - Opened by FailedNamed about 1 month ago
- 2 comments
Labels: bug
#440 - [Bug]: KeyError: 'resource'
Issue -
State: open - Opened by luckystar1992 about 1 month ago
- 1 comment
Labels: bug
#439 - fix error links
Pull Request -
State: closed - Opened by yxdyc about 1 month ago
#438 - [Bug]: Paper link error
Issue -
State: closed - Opened by ForeverNewLee about 1 month ago
- 1 comment
Labels: bug, documentation
#437 - [Bug]: JupyterLab Official sample error
Issue -
State: closed - Opened by Night-Quiet about 2 months ago
- 2 comments
Labels: bug
#436 - fix check_model
Pull Request -
State: closed - Opened by Cathy0908 about 2 months ago
#435 - Refine batch op branch
Pull Request -
State: closed - Opened by BeachWang about 2 months ago
#434 - doc update for sandbox paper
Pull Request -
State: closed - Opened by BeachWang about 2 months ago
Labels: documentation
#433 - Require fps filter and mapper for videos
Issue -
State: open - Opened by BeachWang about 2 months ago
Labels: enhancement, dj:op
#432 - Support RangeSpecifiedFieldSelector for selecting data by the value range of a specified field
Pull Request -
State: open - Opened by 2108038773 about 2 months ago
- 1 comment
#431 - Service/match api
Pull Request -
State: closed - Opened by BeachWang about 2 months ago
Labels: enhancement, agent
#430 - why often happen: One of the subprocesses has abruptly died during map operation?
Issue -
State: closed - Opened by strongcc about 2 months ago
- 5 comments
Labels: question, stale-issue
#429 - Feat/dj adapter
Pull Request -
State: closed - Opened by HYLcool about 2 months ago
#428 - Service/match api
Pull Request -
State: closed - Opened by BeachWang about 2 months ago
#427 - Fix some words
Pull Request -
State: closed - Opened by co63oc about 2 months ago
- 1 comment
Labels: enhancement
#426 - Regress model preloading
Pull Request -
State: closed - Opened by drcege 2 months ago
#425 - Running the command python tools/process_data.py --config train.yaml
Issue -
State: closed - Opened by abchbx 2 months ago
- 1 comment
Labels: question
#424 - fix param definition
Pull Request -
State: closed - Opened by Cathy0908 2 months ago
#423 - Add new OP: image_tagging_mapper
Pull Request -
State: closed - Opened by HYLcool 2 months ago
Labels: dj:multimodal, dj:op
#422 - use pydantic types
Pull Request -
State: closed - Opened by drcege 2 months ago
Labels: bug, enhancement
#421 - fix: missing args in load_formatter of Analyzer
Pull Request -
State: closed - Opened by zhijianma 2 months ago
- 2 comments
#420 - AssertionError
Issue -
State: closed - Opened by abchbx 2 months ago
- 1 comment
Labels: bug, question
#419 - [Bug]: undefined symbol: _ZN3c104cuda9SetDeviceE
Issue -
State: closed - Opened by lh61500 2 months ago
- 3 comments
Labels: bug, stale-issue
#418 - *quick fix*: NestedDataset
Pull Request -
State: closed - Opened by HYLcool 2 months ago
#417 - [Feat] Data-Juicer as a Service
Issue -
State: closed - Opened by drcege 2 months ago
- 3 comments
Labels: enhancement, stale-issue
#416 - [Feat] Enhance type hints and parameter validation
Issue -
State: closed - Opened by drcege 2 months ago
- 1 comment
Labels: bug, enhancement
#415 - Automatically split input dataset in ray mode
Pull Request -
State: open - Opened by pan-x-c 2 months ago
- 2 comments
#414 - Add lazy import and auto-install dependencies
Pull Request -
State: closed - Opened by BeachWang 2 months ago
- 2 comments
Labels: enhancement
#413 - [Feat] Support explicit `FusedOP` that allows for the configuration and application of multiple operators in smaller, manageable batches
Issue -
State: open - Opened by yxdyc 2 months ago
- 2 comments
Labels: enhancement, dj:op
#412 - [Feat] Support `PythonCodesOperator` and `BashCodesOperator` that wraps an existing python file, or some code snippets to be executed, such as the existing DJ tools.
Issue -
State: open - Opened by yxdyc 2 months ago
- 3 comments
Labels: enhancement, dj:op
#411 - Guidance for OP with multiple data fields to be processed
Issue -
State: open - Opened by yxdyc 2 months ago
- 2 comments
Labels: enhancement
#410 - Use analyzer instead of analyser to maintain consistency
Pull Request -
State: closed - Opened by garyzhang99 2 months ago
#409 - analyzer or analyzer?
Issue -
State: closed - Opened by lilqz66 2 months ago
- 1 comment
Labels: question
#408 - [WIP]Add text tagging by prompt mapper op
Pull Request -
State: open - Opened by garyzhang99 2 months ago
- 2 comments
Labels: dj:op
#407 - rename to fix typo in test_expand_macro_mapper.py
Pull Request -
State: closed - Opened by garyzhang99 2 months ago
#406 - support batch_size>1 for some operators
Pull Request -
State: closed - Opened by Cathy0908 2 months ago
- 1 comment
Labels: enhancement, dj:op
#405 - Add text_pair_similarity_filter
Pull Request -
State: open - Opened by Qirui-jiao 2 months ago
- 2 comments
Labels: enhancement, dj:multimodal, dj:op
#404 - What on earth? Neither your Hugging Face space nor a self-hosted service will run, and the demo does not run either
Issue -
State: closed - Opened by coder4nlp 2 months ago
- 2 comments
#403 - Update the KDD tutorial info
Pull Request -
State: closed - Opened by yxdyc 3 months ago
#402 - Add turbo mode
Pull Request -
State: closed - Opened by drcege 3 months ago
Labels: enhancement
#401 - Add sentence_augmentation_mapper
Pull Request -
State: open - Opened by Qirui-jiao 3 months ago
- 2 comments
Labels: enhancement, dj:multimodal, dj:op
#400 - Add mllm_mapper
Pull Request -
State: open - Opened by Qirui-jiao 3 months ago
- 4 comments
Labels: enhancement, dj:multimodal, dj:op
#399 - Enhance/ckpt
Pull Request -
State: closed - Opened by drcege 3 months ago
Labels: bug, enhancement
#398 - Heavy dependency of Data-Juicer
Issue -
State: closed - Opened by BeachWang 3 months ago
- 4 comments
Labels: enhancement
#397 - update spacy to deal conflict with ms-swift
Pull Request -
State: closed - Opened by BeachWang 3 months ago
- 4 comments
Labels: enhancement
#396 - Enhance/ckpt
Pull Request -
State: closed - Opened by drcege 3 months ago
Labels: enhancement
#395 - Add sdxl_prompt2prompt_mapper
Pull Request -
State: open - Opened by Qirui-jiao 3 months ago
- 2 comments
Labels: enhancement, dj:multimodal, dj:op
#394 - Add segment_mapper
Pull Request -
State: open - Opened by Qirui-jiao 3 months ago
- 2 comments
Labels: enhancement, dj:multimodal, dj:op
#393 - Add image_pair_similarity_filter
Pull Request -
State: closed - Opened by Qirui-jiao 3 months ago
Labels: enhancement, dj:multimodal, dj:op
#392 - Fix spelling errors in documentation
Pull Request -
State: closed - Opened by TobyJasper 3 months ago
Labels: documentation
#391 - Fix an edge case when the current configuration has fewer OPs than the checkpoint
Pull Request -
State: closed - Opened by drcege 3 months ago
Labels: bug
#390 - skip inactive preloading for efficiency
Pull Request -
State: closed - Opened by drcege 3 months ago
Labels: dj:multimodal, dj:dist
#389 - + add use_cuda for get_model funcs in two OPs
Pull Request -
State: closed - Opened by HYLcool 3 months ago
Labels: bug, priority:high
#388 - [Bug]: Process is killed outright during "Loading checkpoint shards:"; what is happening, is it out of memory?
Issue -
State: closed - Opened by ZHJ19970917 3 months ago
- 1 comment
Labels: bug
#387 - [Bug]: Deduplication hash computation is stuck at 100% and never proceeds to filtering
Issue -
State: closed - Opened by xiafeng-nb 3 months ago
- 6 comments
Labels: bug, priority:high
#386 - Support RangeSpecifiedFieldSelector for selecting data by the value range of a specified field
Pull Request -
State: closed - Opened by 2108038773 3 months ago
- 4 comments
#385 - Trust remote code param
Pull Request -
State: closed - Opened by 2108038773 3 months ago
#384 - minor fix for tutorial & add news for ImgDiff
Pull Request -
State: closed - Opened by yxdyc 3 months ago
#383 - Confused with the meaning of 'preprocess' time-consuming in the `reproduced_redpajama /README.md`
Issue -
State: closed - Opened by flyflypeng 3 months ago
- 2 comments
#382 - Add a trust_remote_code parameter to all operators involving hf_model and pass it to the prepare_model function
Pull Request -
State: closed - Opened by 2108038773 3 months ago
- 2 comments
Labels: enhancement, good first issue, dj:op
#381 - Add suggestions for updating the survey.
Pull Request -
State: closed - Opened by zhenqincn 3 months ago
#380 - Is it possible to set multiple text_key values for one op?
Issue -
State: closed - Opened by lihongxiacream 3 months ago
- 3 comments
Labels: stale-issue
#379 - [Bug]: ModuleNotFoundError: No module named 'tools.mm_eval' shown when running sandbox
Issue -
State: closed - Opened by Snow0111 3 months ago
- 3 comments
Labels: bug