Trying to convert JSON to Parquet
Sample Json:
{
  "Stype":"BaseDecorator",
  "Decorators":[
    {"Stype":"FiscalInformationDecorator","FiscalInformation":{"Stype":"FiscalInformation","UUID":"02d0c973-727e-449e-bb4e-45dddbd7dbeb", etc...}},
    {"Stype":"DocumentInformationDecorator","DocumentInformation":{"Stype":"DocumentInformation","DocumentModelID":"7ec7b1d4-f94f-42b5-ba36-77701cdf1db4", etc...}},
    {"Stype":"IssuingInformationDecorator","IssuingInformation":{"Stype":"IssuingInformation","RFC":"PRR890126QC2", etc...}}
  ],
  "InstanceID":"78091f6e-e458-4a23-abfe-fe286b24b59a",
  "company":"d6038f2d-787c-427b-8eaf-4d9eea44a24a"
}
Decorators is an array
Using:
var stringJson = JArray.FromObject(deserialized_jsons).ToString();
using (var r = ChoJSONReader.LoadText(stringJson).ErrorMode(ChoErrorMode.IgnoreAndContinue))
{
    using (var w = new ChoParquetWriter(stream, new ChoParquetRecordConfiguration { CompressionMethod = Parquet.CompressionMethod.Snappy })
        .ThrowAndStopOnMissingField(false)
        .ErrorMode(ChoErrorMode.IgnoreAndContinue))
    {
        w.Write(r);
    }
}
Can I have it be represented as:
stype string
decorators array<struct<Stype:string,FiscalInformation:struct<Stype:string,UUID:string,CFDIUse:string, etc...
InstanceID string
company string
instead of
stype string
decorators_0_stype string
decorators_0_fiscalinformation_stype string
decorators_0_fiscalinformation_uuid string, etc...
I don't want a column for each property of each nested array, all of them separated by numbers. I want one column that contains all the elements of the nested array.
Is there a way to have the column be an array for search purposes? (e.g. when using Amazon Athena to query the file as a parquet file)
If I generate the Parquet file with AWS Glue, it gives me the column as an array.
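To make the unwanted shape concrete, here is a minimal sketch of how index-numbered column names like decorators_0_stype can arise when nested JSON is flattened. It uses only Newtonsoft.Json; the `FlattenSketch` class and its `Flatten` method are hypothetical names made up for illustration, and the lowercasing and `_` separator are assumptions chosen to mirror the column names listed above. This is not ChoETL's API and not necessarily how ChoETL flattens records internally.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class FlattenSketch
{
    // Hypothetical helper (not part of ChoETL): walks a JSON token and emits
    // one entry per leaf value. Property names are lowercased and joined with
    // array indexes using '_' (both are assumptions made for this sketch).
    public static Dictionary<string, string> Flatten(JToken token, string prefix = "")
    {
        var columns = new Dictionary<string, string>();
        Walk(token, prefix, columns);
        return columns;
    }

    private static void Walk(JToken token, string prefix, Dictionary<string, string> columns)
    {
        switch (token)
        {
            case JObject obj:
                // Each property extends the column name by its own name.
                foreach (JProperty prop in obj.Properties())
                    Walk(prop.Value, Join(prefix, prop.Name.ToLowerInvariant()), columns);
                break;
            case JArray arr:
                // Each array element extends the column name by its index,
                // which is where the "_0_", "_1_", ... segments come from.
                for (int i = 0; i < arr.Count; i++)
                    Walk(arr[i], Join(prefix, i.ToString()), columns);
                break;
            default:
                // Leaf value: store it under the accumulated column name.
                columns[prefix] = token.ToString();
                break;
        }
    }

    private static string Join(string prefix, string name) =>
        prefix.Length == 0 ? name : prefix + "_" + name;

    public static void Main()
    {
        // Cut-down version of the sample JSON above.
        var json = JObject.Parse(@"{
            ""Stype"": ""BaseDecorator"",
            ""Decorators"": [
                { ""Stype"": ""FiscalInformationDecorator"",
                  ""FiscalInformation"": { ""Stype"": ""FiscalInformation"" } }
            ]
        }");

        foreach (var column in Flatten(json))
            Console.WriteLine($"{column.Key} = {column.Value}");
    }
}
```

With the cut-down input, the resulting keys are stype, decorators_0_stype and decorators_0_fiscalinformation_stype. Every extra array element adds another numbered family of columns, which is exactly what I want to avoid by keeping Decorators as a single array<struct<...>> column.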