Dataset Preview
The full dataset viewer is not available. Only showing a preview of the rows.
The dataset generation failed
Error code:   DatasetGenerationError
Exception:    TypeError
Message:      Couldn't cast array of type
struct<repo_id: string, repository_url: string, file_path: string, line_num: int64, func_num: int64, category: string, quality_rating: string, description: string, stars_estimate: string, notes: string, sample_id: int64, group_type: string, group_size: int64, coupling: double, group_score: double, group_difficulty: double, group_overall_score: double, group_coherence_score: double, func_names: list<item: string>, file_lines: int64, graph_stats: struct<total_functions: int64, call_edges: int64, sibling_pairs: int64>, func_name: string, start_line: int64, end_line: int64, loc: int64, complexity: double, inferability: double, fim_score: double, difficulty: double, overall_score: double, is_feasible: bool, correctness: int64, executability: int64, api_usage: int64, readability: int64, completeness: int64>
to
{'repo_id': Value('string'), 'repository_url': Value('string'), 'file_path': Value('string'), 'line_num': Value('int64'), 'func_num': Value('int64'), 'category': Value('string'), 'quality_rating': Value('string'), 'description': Value('string'), 'stars_estimate': Value('string'), 'notes': Value('string'), 'sample_id': Value('int64'), 'group_type': Value('string'), 'group_size': Value('int64'), 'coupling': Value('float64'), 'group_score': Value('float64'), 'group_difficulty': Value('float64'), 'group_overall_score': Value('float64'), 'group_coherence_score': Value('float64'), 'func_names': List(Value('string'))}
Traceback:    Traceback (most recent call last):
                File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1779, in _prepare_split_single
                  for key, table in generator:
                                    ^^^^^^^^^
                File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 609, in wrapped
                  for item in generator(*args, **kwargs):
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/packaged_modules/json/json.py", line 295, in _generate_tables
                  self._cast_table(pa_table, json_field_paths=json_field_paths),
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/packaged_modules/json/json.py", line 128, in _cast_table
                  pa_table = table_cast(pa_table, self.info.features.arrow_schema)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2281, in table_cast
                  return cast_table_to_schema(table, schema)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2233, in cast_table_to_schema
                  cast_array_to_feature(
                File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 1804, in wrapper
                  return pa.chunked_array([func(chunk, *args, **kwargs) for chunk in array.chunks])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2101, in cast_array_to_feature
                  raise TypeError(f"Couldn't cast array of type\n{_short_str(array.type)}\nto\n{_short_str(feature)}")
              TypeError: Couldn't cast array of type
              struct<repo_id: string, repository_url: string, file_path: string, line_num: int64, func_num: int64, category: string, quality_rating: string, description: string, stars_estimate: string, notes: string, sample_id: int64, group_type: string, group_size: int64, coupling: double, group_score: double, group_difficulty: double, group_overall_score: double, group_coherence_score: double, func_names: list<item: string>, file_lines: int64, graph_stats: struct<total_functions: int64, call_edges: int64, sibling_pairs: int64>, func_name: string, start_line: int64, end_line: int64, loc: int64, complexity: double, inferability: double, fim_score: double, difficulty: double, overall_score: double, is_feasible: bool, correctness: int64, executability: int64, api_usage: int64, readability: int64, completeness: int64>
              to
              {'repo_id': Value('string'), 'repository_url': Value('string'), 'file_path': Value('string'), 'line_num': Value('int64'), 'func_num': Value('int64'), 'category': Value('string'), 'quality_rating': Value('string'), 'description': Value('string'), 'stars_estimate': Value('string'), 'notes': Value('string'), 'sample_id': Value('int64'), 'group_type': Value('string'), 'group_size': Value('int64'), 'coupling': Value('float64'), 'group_score': Value('float64'), 'group_difficulty': Value('float64'), 'group_overall_score': Value('float64'), 'group_coherence_score': Value('float64'), 'func_names': List(Value('string'))}
              
              The above exception was the direct cause of the following exception:
              
              Traceback (most recent call last):
                File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 1342, in compute_config_parquet_and_info_response
                  parquet_operations, partial, estimated_dataset_info = stream_convert_to_parquet(
                                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 907, in stream_convert_to_parquet
                  builder._prepare_split(split_generator=splits_generators[split], file_format="parquet")
                File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1646, in _prepare_split
                  for job_id, done, content in self._prepare_split_single(
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
                File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1832, in _prepare_split_single
                  raise DatasetGenerationError("An error occurred while generating the dataset") from e
              datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
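
The cast fails because the source struct carries fields that the target features lack: the struct lists 36 fields, while the target stops at the first 19 (through `func_names`). The following is a minimal sketch, assuming the rows are available as Python dicts before upload, that lists the offending fields and pads every record to the union of keys. The helper names `extra_fields` and `pad_to_union` are hypothetical, and whether padding alone is enough for the hosted viewer is an assumption that the log does not confirm.

```python
# Field names copied from the error message above.
TARGET_FIELDS = [
    "repo_id", "repository_url", "file_path", "line_num", "func_num",
    "category", "quality_rating", "description", "stars_estimate", "notes",
    "sample_id", "group_type", "group_size", "coupling", "group_score",
    "group_difficulty", "group_overall_score", "group_coherence_score",
    "func_names",
]
SOURCE_FIELDS = TARGET_FIELDS + [
    "file_lines", "graph_stats", "func_name", "start_line", "end_line",
    "loc", "complexity", "inferability", "fim_score", "difficulty",
    "overall_score", "is_feasible", "correctness", "executability",
    "api_usage", "readability", "completeness",
]


def extra_fields(source, target):
    """Return fields present in source but absent from target, in order."""
    known = set(target)
    return [name for name in source if name not in known]


def pad_to_union(records):
    """Give every record the same keys, filling missing ones with None."""
    keys = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)
    return [{key: record.get(key) for key in keys} for record in records]


if __name__ == "__main__":
    # Fields that appear in the source struct but not in the target features.
    print(extra_fields(SOURCE_FIELDS, TARGET_FIELDS))

    # Toy rows with illustrative values only, not taken from the dataset.
    rows = [{"repo_id": "a"}, {"repo_id": "b", "file_lines": 10}]
    print(pad_to_union(rows))
```

Padding keeps every row, at the cost of null values in records that lack the extra fields. Splitting the two record shapes into separate files or configurations would be an alternative if the nulls are undesirable.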


Columns: messages (list), metadata (dict)
[ { "role": "user", "content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ...
{ "repo_id": "578", "repository_url": "https://github.com/tensorflow/tensorflow", "file_path": "/tensorflow/python/autograph/pyct/static_analysis/type_inference_test.py", "line_num": 942, "func_num": 145, "category": "Category 1: From Scratch", "quality_rating": "extra-high", "description": "TensorFlow"...
[ { "role": "user", "content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ...
{ "repo_id": "1003", "repository_url": "https://github.com/pytest-dev/pytest", "file_path": "/testing/test_debugging.py", "line_num": 1436, "func_num": 74, "category": "Testing Framework", "quality_rating": "extra-high", "description": "The pytest framework makes it easy to write small tests, yet scales...
[ { "role": "user", "content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ...
{ "repo_id": "181", "repository_url": "https://github.com/pyqtgraph/pyqtgraph", "file_path": "/pyqtgraph/graphicsItems/ScatterPlotItem.py", "line_num": 1241, "func_num": 75, "category": "Category 6: Visualization", "quality_rating": "high", "description": "Scientific graphics", "stars_estimate": "4000...
[ { "role": "user", "content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ...
{ "repo_id": "1005", "repository_url": "https://github.com/astropy/astropy", "file_path": "/astropy/nddata/covariance.py", "line_num": 1063, "func_num": 42, "category": "Scientific Computing", "quality_rating": "high", "description": "A community Python library for Astronomy.", "stars_estimate": "4500...
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"198","repository_url":"https://github.com/RustPython/RustPython","file_path":"/Lib/bdb.p(...TRUNCATED)
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"82","repository_url":"https://github.com/Rapptz/discord.py","file_path":"/discord/emoji.(...TRUNCATED)
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"1009","repository_url":"https://github.com/pylint-dev/pylint","file_path":"/tests/test_s(...TRUNCATED)
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"771","repository_url":"https://github.com/apache/spark","file_path":"/python/pyspark/sql(...TRUNCATED)
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"907","repository_url":"https://github.com/psf/black","file_path":"/src/black/lines.py","(...TRUNCATED)
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED)
{"repo_id":"1002","repository_url":"https://github.com/matplotlib/matplotlib","file_path":"/lib/matp(...TRUNCATED)
End of preview.
