Guest
The problem is how to achieve the transformation described below.
The source XML contains a large number of repeating structures like the one
below. Each Item node contains a Person element and an Insurance element, and
the Insurance element refers to a Person element through the person id.
<Item>
  <Person id="p123" name="someone1"/>
  <Insurance ref="p123" detail="blabla1"/>
</Item>
<Item>
  <Person id="p123" name="someone1"/>
  <Insurance ref="p456" detail="blabla2"/>
</Item>
<Item>
  <Person id="p456" name="someone1"/>
  <Insurance ref="p123" detail="blabla3"/>
</Item>
The goal is to regroup this into a one-to-many structure, with one Person
followed by all of the Insurance elements that reference it, like below:
<Item>
  <Person id="p123" name="someone1"/>
  <Insurance ref="p123" detail="blabla1"/>
  <Insurance ref="p123" detail="blabla3"/>
</Item>
My initial idea was to load the source into memory and split it into
Hashtables so that I could easily regroup it. However, the source files are
really big (approximately 50 MB each, with about 70,000 repeating items), so
this approach consumes too much memory. After a whole day of thinking about
it I still cannot figure out a better way, which is frustrating. I would
really appreciate any help.
Thanks in advance
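
For illustration only, here is a minimal sketch of one way the regrouping could be done without building the whole document in memory. It is written in Python with the standard library's streaming parser (xml.etree.ElementTree.iterparse) and rests on several assumptions that the post does not confirm: that the Item elements sit inside a single root element (called Items here, a made-up name), that Person and Insurance are empty elements carrying only attributes, and that the first Person seen for a given id is representative. The function names regroup and format_attrs are hypothetical. The sketch still keeps the attribute values per person id in dictionaries, which should be much smaller than a full tree but is not constant memory.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
from xml.sax.saxutils import quoteattr


def format_attrs(attrs):
    """Render an attribute dict as ' key="value"' pairs with XML escaping."""
    return "".join(" %s=%s" % (key, quoteattr(value)) for key, value in attrs.items())


def regroup(source, target, root_tag="Items"):
    """Stream Items from `source` and write one Item per Person to `target`.

    `source` is a file name or binary file object; `target` is a text file
    object. `root_tag` is an assumed name for the output root element.
    """
    persons = {}                    # person id -> Person attributes (first seen)
    insurances = defaultdict(list)  # person id -> list of Insurance attributes

    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element

    for event, elem in context:
        if event != "end":
            continue
        if elem.tag == "Person":
            persons.setdefault(elem.get("id"), dict(elem.attrib))
        elif elem.tag == "Insurance":
            insurances[elem.get("ref")].append(dict(elem.attrib))
        elif elem.tag == "Item":
            # Drop the Items parsed so far so the tree does not keep growing.
            root.clear()

    target.write("<%s>\n" % root_tag)
    for person_id, person_attrs in persons.items():
        target.write("  <Item>\n")
        target.write("    <Person%s/>\n" % format_attrs(person_attrs))
        for insurance_attrs in insurances.get(person_id, []):
            target.write("    <Insurance%s/>\n" % format_attrs(insurance_attrs))
        target.write("  </Item>\n")
    target.write("</%s>\n" % root_tag)


# Small in-memory demonstration using the three Items from the post.
SAMPLE = b"""<Items>
<Item><Person id="p123" name="someone1"/><Insurance ref="p123" detail="blabla1"/></Item>
<Item><Person id="p123" name="someone1"/><Insurance ref="p456" detail="blabla2"/></Item>
<Item><Person id="p456" name="someone1"/><Insurance ref="p123" detail="blabla3"/></Item>
</Items>"""

out = io.StringIO()
regroup(io.BytesIO(SAMPLE), out)
print(out.getvalue())
```

For a real file, the same function could be called with a file name as source and an opened text file as target. Insurance elements whose ref matches no Person are silently skipped in this sketch; whether that is acceptable depends on the data.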