Author |
Message
|
syedimtiyaz1234 |
Posted: Fri Nov 06, 2020 9:51 am Post subject: compressing data before sending to Trace node |
|
|
Apprentice
Joined: 22 Jan 2018 Posts: 30
|
Hey,
Currently we are logging a base64-encoded string in the Trace file, but the data size is a concern.
I am trying to find a way to reduce the size of what it logs while still letting a developer easily convert the data back to the original.
If anyone has a suggestion, please let me know.
Thanks in advance. |
|
|
|
gbaddeley |
Posted: Sun Nov 08, 2020 11:35 am Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
Can you skip the base64 encoding of the string? The raw string is about 75% of the base64-encoded size. _________________ Glenn |
|
|
|
syedimtiyaz1234 |
Posted: Mon Nov 09, 2020 2:13 pm Post subject: |
|
|
Apprentice
Joined: 22 Jan 2018 Posts: 30
|
If we log the plain data as it is, it is not formatted; the data could be JSON/XML or a plain string, so it does not look good.
When we encode it, the logging format is good, i.e. we can follow a generic logging format. |
|
|
|
timber |
Posted: Mon Nov 09, 2020 4:02 pm Post subject: |
|
|
Grand Master
Joined: 25 Aug 2015 Posts: 1290
|
You need to decide whether you want readability or compression. You cannot have both. If this log format is for developers, you should be able to think up other solutions (e.g. log4j) that allow more flexibility. The base64 encoding is extremely easy to decode using any scripting language or Java. |
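As a rough illustration of how little work the decoding step is, here is a minimal Python sketch. The function name `decode_logged_payload` and the sample XML are made up for this example, and it assumes the payload was UTF-8 text before it was encoded.

```python
import base64

def decode_logged_payload(encoded: str, charset: str = "utf-8") -> str:
    """Turn one base64 string copied from the trace file back into text."""
    raw = base64.b64decode(encoded)
    return raw.decode(charset)

# Simulate what would have been written to the trace file (hypothetical payload).
logged = base64.b64encode(b'<order id="42"><item>widget</item></order>').decode("ascii")

original = decode_logged_payload(logged)
print(original)
```

A developer would paste the logged string in place of `logged` and read the printed result.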
|
|
|
syedimtiyaz1234 |
Posted: Tue Nov 10, 2020 10:36 am Post subject: |
|
|
Apprentice
Joined: 22 Jan 2018 Posts: 30
|
I am looking for both readability and compression.
We are using IIB Monitoring events to receive the data that we need to log, so even if we use log4j, the data stored at the end will be the same, right? Or am I missing anything?
So we don't have any way to reduce the data size when we log to a file.
Let's say we want to log plain XML data to a file: is there any way we can compress it before logging? |
|
|
|
gbaddeley |
Posted: Tue Nov 10, 2020 3:13 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
Note that base64 does not compress data. It actually makes it longer....
To reduce the size, you could try squeezing out all the white space.
A logging framework should be able to cope with large messages and have room to scale. Disk space is cheap. I am not 100% sure what your concern is. _________________ Glenn |
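A possible way to squeeze out the white space, sketched in Python. The helper names `squeeze_json` and `squeeze_xml` are invented for this example. The XML version only drops whitespace that sits between two tags, which assumes that whitespace is never significant in your messages (no mixed content, no `xml:space="preserve"`, no CDATA containing `> <`).

```python
import json
import re

def squeeze_json(text: str) -> str:
    """Re-serialise JSON with no spaces after ',' or ':' and no line breaks."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)

def squeeze_xml(text: str) -> str:
    """Drop whitespace between tags only; text inside elements is left alone."""
    return re.sub(r">\s+<", "><", text.strip())

pretty_json = '{ "a": 1,\n  "b": [1, 2, 3] }'
pretty_xml = "<a>\n  <b>hello world</b>\n</a>\n"

print(squeeze_json(pretty_json))
print(squeeze_xml(pretty_xml))
```

The result stays human-readable, so this trades away only the indentation, not the ability to read the log directly.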
|
|
|
syedimtiyaz1234 |
Posted: Fri Nov 13, 2020 12:50 pm Post subject: |
|
|
Apprentice
Joined: 22 Jan 2018 Posts: 30
|
Yes, base64 does not compress the data.
If we log the original data, my concern is readability, which is why I thought of base64 encoding: at least it solves my readability concern. So I am looking for another way that keeps that readability and takes less space than the base64-encoded string.
For me disk space is an issue. I agree we should have room to scale, but I am trying to see if I can do something to reduce the size. |
|
|
|
fjb_saper |
Posted: Fri Nov 13, 2020 3:24 pm Post subject: |
|
|
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20729 Location: LI,NY
|
Have you thought about encrypting the data with a key? In the long run you might have less data to ship across when encrypted than when using base64...
You could also look at compression on the fly? Have the agent compress the data before sending it to your backend... _________________ MQ & Broker admin |
|
|
|
gbaddeley |
Posted: Sun Nov 15, 2020 1:44 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
syedimtiyaz1234 wrote: |
Yes, base64 does not compress the data.
If we log the original data, my concern is readability, which is why I thought of base64 encoding: at least it solves my readability concern. So I am looking for another way that keeps that readability and takes less space than the base64-encoded string.
For me disk space is an issue. I agree we should have room to scale, but I am trying to see if I can do something to reduce the size. |
Have you looked at RLE, LZ or ZIP compression algorithms?
I'm not sure what you mean by readability. Does this mean storing a shortened form of the data, but being able to easily read the original data? _________________ Glenn |
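To get a feel for what these algorithms might save, a quick comparison can be run with the compressors that ship with Python: `zlib` (DEFLATE, the method used in ZIP files), `bz2`, and `lzma` (an LZ-family algorithm). RLE is left out here. The sample data and the helper name `compressed_sizes` are invented for this sketch; real messages will compress differently, so treat the numbers as indicative only.

```python
import bz2
import lzma
import zlib

def compressed_sizes(raw: bytes) -> dict:
    """Return the byte count of the input and of each compressed form."""
    return {
        "raw": len(raw),
        "zlib": len(zlib.compress(raw)),
        "bz2": len(bz2.compress(raw)),
        "lzma": len(lzma.compress(raw)),
    }

# Hypothetical, highly repetitive XML similar to a batch of logged events.
sample = b"".join(
    b'<event seq="%d"><status>OK</status></event>\n' % i for i in range(500)
)

sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name:5s} {size:6d} bytes")
```

Running it against a few real payloads from the flow would show whether the saving is worth losing direct readability.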
|
|
|
timber |
Posted: Mon Nov 16, 2020 8:27 am Post subject: |
|
|
Grand Master
Joined: 25 Aug 2015 Posts: 1290
|
Just to be clear
- You cannot safely put raw bytes into a text file, so logging the 'raw data' or a compressed version of it is not advisable.
- The simple (naive) solution is to encode each byte value using two characters (so 0x61 becomes "61", stored as the bytes 0x36 0x31, which occupies two bytes in a single-byte character set or UTF-8).
- A better solution is to encode the bytes using base64 encoding. This uses 4 characters for every 3 bytes, so the text data is 33% larger than the raw data. Yes, it is a little larger than the 'raw' data - but it is also safe for all text formats.
You can apply base64 encoding to any sequence of bytes, so you can compress using any of the popular compression algorithms and then base64-encode the resulting byte stream. It will be harder to read, but if your main objective is to reduce disk space... |
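The size arithmetic and the compress-then-encode idea above can be checked with a short Python sketch. The names `pack` and `unpack` and the sample payload are assumptions made for this example, and `zlib` stands in for "any of the popular compression algorithms".

```python
import base64
import binascii
import zlib

def pack(raw: bytes) -> str:
    """Compress the bytes, then base64-encode so the result is text-safe."""
    return base64.b64encode(zlib.compress(raw, 9)).decode("ascii")

def unpack(logged: str) -> bytes:
    """Reverse of pack(): base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(logged))

# Hypothetical payload: 40 copies of a 50-byte XML fragment (2000 bytes).
raw = b"<order><id>42</id><status>SHIPPED</status></order>" * 40

hex_text = binascii.hexlify(raw).decode("ascii")   # two characters per byte
b64_text = base64.b64encode(raw).decode("ascii")   # four characters per three bytes
packed = pack(raw)                                 # compressed, then base64

print("raw bytes     :", len(raw))
print("hex text      :", len(hex_text))
print("base64 text   :", len(b64_text))
print("zlib + base64 :", len(packed))
```

How much `pack` saves depends entirely on how repetitive the payload is; very short or already-compressed messages may not shrink at all.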
|
|
|
gbaddeley |
Posted: Mon Nov 16, 2020 2:27 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
syedimtiyaz1234 wrote: |
If we log the plain data as it is, it is not formatted; the data could be JSON/XML or a plain string, so it does not look good.
When we encode it, the logging format is good, i.e. we can follow a generic logging format. |
Can you clarify what you mean by 'does not look good'?
Are you looking for a generic logging format that converts JSON, XML or other string data into a common format? E.g. converting everything to JSON and using only printable characters.
Our IIB flows process many different formats of data and log them as-is. This leaves no doubt about the content or format of the original data. There is a learning curve for being able to visually read XML, JSON, CSV, fixed columns, etc., but it's not very steep. _________________ Glenn |
|
|
|
|