Dataset Preview
The full dataset viewer is not available. Only showing a preview of the rows.
The dataset generation failed
Error code: DatasetGenerationError
Exception: TypeError
Message: Couldn't cast array of type
struct<repo_id: string, repository_url: string, file_path: string, line_num: int64, func_num: int64, category: string, quality_rating: string, description: string, stars_estimate: string, notes: string, sample_id: int64, group_type: string, group_size: int64, coupling: double, group_score: double, group_difficulty: double, group_overall_score: double, group_coherence_score: double, func_names: list<item: string>, file_lines: int64, graph_stats: struct<total_functions: int64, call_edges: int64, sibling_pairs: int64>, func_name: string, start_line: int64, end_line: int64, loc: int64, complexity: double, inferability: double, fim_score: double, difficulty: double, overall_score: double, is_feasible: bool, correctness: int64, executability: int64, api_usage: int64, readability: int64, completeness: int64>
to
{'repo_id': Value('string'), 'repository_url': Value('string'), 'file_path': Value('string'), 'line_num': Value('int64'), 'func_num': Value('int64'), 'category': Value('string'), 'quality_rating': Value('string'), 'description': Value('string'), 'stars_estimate': Value('string'), 'notes': Value('string'), 'sample_id': Value('int64'), 'group_type': Value('string'), 'group_size': Value('int64'), 'coupling': Value('float64'), 'group_score': Value('float64'), 'group_difficulty': Value('float64'), 'group_overall_score': Value('float64'), 'group_coherence_score': Value('float64'), 'func_names': List(Value('string'))}
Traceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1779, in _prepare_split_single
for key, table in generator:
^^^^^^^^^
File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 609, in wrapped
for item in generator(*args, **kwargs):
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/packaged_modules/json/json.py", line 295, in _generate_tables
self._cast_table(pa_table, json_field_paths=json_field_paths),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/packaged_modules/json/json.py", line 128, in _cast_table
pa_table = table_cast(pa_table, self.info.features.arrow_schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2281, in table_cast
return cast_table_to_schema(table, schema)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2233, in cast_table_to_schema
cast_array_to_feature(
File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 1804, in wrapper
return pa.chunked_array([func(chunk, *args, **kwargs) for chunk in array.chunks])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/table.py", line 2101, in cast_array_to_feature
raise TypeError(f"Couldn't cast array of type\n{_short_str(array.type)}\nto\n{_short_str(feature)}")
TypeError: Couldn't cast array of type
struct<repo_id: string, repository_url: string, file_path: string, line_num: int64, func_num: int64, category: string, quality_rating: string, description: string, stars_estimate: string, notes: string, sample_id: int64, group_type: string, group_size: int64, coupling: double, group_score: double, group_difficulty: double, group_overall_score: double, group_coherence_score: double, func_names: list<item: string>, file_lines: int64, graph_stats: struct<total_functions: int64, call_edges: int64, sibling_pairs: int64>, func_name: string, start_line: int64, end_line: int64, loc: int64, complexity: double, inferability: double, fim_score: double, difficulty: double, overall_score: double, is_feasible: bool, correctness: int64, executability: int64, api_usage: int64, readability: int64, completeness: int64>
to
{'repo_id': Value('string'), 'repository_url': Value('string'), 'file_path': Value('string'), 'line_num': Value('int64'), 'func_num': Value('int64'), 'category': Value('string'), 'quality_rating': Value('string'), 'description': Value('string'), 'stars_estimate': Value('string'), 'notes': Value('string'), 'sample_id': Value('int64'), 'group_type': Value('string'), 'group_size': Value('int64'), 'coupling': Value('float64'), 'group_score': Value('float64'), 'group_difficulty': Value('float64'), 'group_overall_score': Value('float64'), 'group_coherence_score': Value('float64'), 'func_names': List(Value('string'))}
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 1342, in compute_config_parquet_and_info_response
parquet_operations, partial, estimated_dataset_info = stream_convert_to_parquet(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 907, in stream_convert_to_parquet
builder._prepare_split(split_generator=splits_generators[split], file_format="parquet")
File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1646, in _prepare_split
for job_id, done, content in self._prepare_split_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/datasets/builder.py", line 1832, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
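The cast error above appears to arise because the incoming metadata struct carries fields (file_lines, graph_stats, func_name, and the per-function scores) that the target feature set, which ends at func_names, does not declare. One possible workaround, sketched below under the assumption that the data is stored as records with a `metadata` dict, is to pad every record so that all rows share the same metadata keys before the files are uploaded. The helper names `union_keys` and `normalize` and the sample rows are hypothetical and only illustrate the idea; this is not a confirmed fix for this dataset.

```python
def union_keys(records, field="metadata"):
    """Collect every key seen under `field`, in first-seen order."""
    keys, seen = [], set()
    for rec in records:
        for key in rec.get(field, {}):
            if key not in seen:
                seen.add(key)
                keys.append(key)
    return keys


def normalize(records, field="metadata"):
    """Return copies of the records whose `field` dicts all share the same keys.

    Keys missing from a record are filled with None, so a schema inferred
    from any one row declares every field.
    """
    keys = union_keys(records, field)
    normalized = []
    for rec in records:
        meta = rec.get(field, {})
        fixed = dict(rec)
        fixed[field] = {key: meta.get(key) for key in keys}
        normalized.append(fixed)
    return normalized


# Hypothetical rows: the first has group-level fields only, the second
# has function-level fields only (values are made up for illustration).
rows = [
    {"messages": [], "metadata": {"repo_id": "578", "func_names": ["a", "b"]}},
    {"messages": [], "metadata": {"repo_id": "1003", "func_name": "c", "loc": 12}},
]
normalized = normalize(rows)
```

In this sketch a nested struct such as graph_stats would be padded with a single None rather than a struct of nulls; whether that is enough for schema inference on this dataset is an assumption that would need checking.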
messages list | metadata dict |
|---|---|
[
{
"role": "user",
"content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ... | {
"repo_id": "578",
"repository_url": "https://github.com/tensorflow/tensorflow",
"file_path": "/tensorflow/python/autograph/pyct/static_analysis/type_inference_test.py",
"line_num": 942,
"func_num": 145,
"category": "Category 1: From Scratch",
"quality_rating": "extra-high",
"description": "TensorFlow"... |
[
{
"role": "user",
"content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ... | {
"repo_id": "1003",
"repository_url": "https://github.com/pytest-dev/pytest",
"file_path": "/testing/test_debugging.py",
"line_num": 1436,
"func_num": 74,
"category": "Testing Framework",
"quality_rating": "extra-high",
"description": "The pytest framework makes it easy to write small tests, yet scales... |
[
{
"role": "user",
"content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ... | {
"repo_id": "181",
"repository_url": "https://github.com/pyqtgraph/pyqtgraph",
"file_path": "/pyqtgraph/graphicsItems/ScatterPlotItem.py",
"line_num": 1241,
"func_num": 75,
"category": "Category 6: Visualization",
"quality_rating": "high",
"description": "Scientific graphics",
"stars_estimate": "4000... |
[
{
"role": "user",
"content": "You are an expert Python programmer. Your task is to complete 3 masked functions based on the surrounding code context.\n\n## Task\n\nBelow is a Python file where 3 function bodies have been replaced with `# <MASKED_FUNCTION_BODY>`. These functions are **structurally related** ... | {
"repo_id": "1005",
"repository_url": "https://github.com/astropy/astropy",
"file_path": "/astropy/nddata/covariance.py",
"line_num": 1063,
"func_num": 42,
"category": "Scientific Computing",
"quality_rating": "high",
"description": "A community Python library for Astronomy.",
"stars_estimate": "4500... |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"198","repository_url":"https://github.com/RustPython/RustPython","file_path":"/Lib/bdb.p(...TRUNCATED) |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"82","repository_url":"https://github.com/Rapptz/discord.py","file_path":"/discord/emoji.(...TRUNCATED) |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"1009","repository_url":"https://github.com/pylint-dev/pylint","file_path":"/tests/test_s(...TRUNCATED) |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"771","repository_url":"https://github.com/apache/spark","file_path":"/python/pyspark/sql(...TRUNCATED) |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"907","repository_url":"https://github.com/psf/black","file_path":"/src/black/lines.py","(...TRUNCATED) |
[{"role":"user","content":"You are an expert Python programmer. Your task is to complete 3 masked fu(...TRUNCATED) | {"repo_id":"1002","repository_url":"https://github.com/matplotlib/matplotlib","file_path":"/lib/matp(...TRUNCATED) |
End of preview.