@Watson1978 Watson1978 commented Jan 6, 2025

Which issue(s) this PR fixes:
Fixes #

What this PR does / why we need it:
When reading a 10 GB file with the in_tail plugin, Fluentd calls Yajl methods many times while storing events to the buffer file.

Recently, Ruby's JSON has improved greatly and is now much faster.
Ref. https://byroot.github.io/ruby/json/2024/12/15/optimizing-ruby-json-part-1.html

  • Before
    • It took 88.533409329 sec to process a 10 GB file
  • After
    • It took 72.324643557 sec to process a 10 GB file (about 18% less time)
  • config

<source>
  @type tail
  path /home/watson/prj/sandbox/fluentd/log/access*.log
  pos_file /home/watson/prj/sandbox/fluentd/log/access.log.pos
  tag log
  read_from_head true
  follow_inodes true
  # rotate_wait 0
  refresh_interval 5
  # open_on_every_update true
  <parse>
    @type none
  </parse>
</source>

<match **>
  @type file
  path /home/watson/prj/sandbox/fluentd/log/log
</match>
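
For a rough feel of the serialization cost outside of Fluentd, a micro-benchmark along these lines may help. It is only a sketch under stated assumptions: the record shape, the iteration count, and the choice of `JSON.generate` versus `Yajl.dump` as the calls to compare are illustrative and are not taken from the diff of this PR. `Yajl.dump` is measured only when the yajl-ruby gem happens to be installed.

```ruby
require "benchmark"
require "json"

# Hypothetical record: with `@type none`, in_tail emits one "message" field per line.
record = { "message" => '127.0.0.1 - - [06/Jan/2025:00:00:00 +0900] "GET / HTTP/1.1" 200 512' }
iterations = 100_000 # arbitrary count, chosen for illustration only

serializers = { "JSON.generate" => ->(r) { JSON.generate(r) } }

begin
  require "yajl"
  serializers["Yajl.dump"] = ->(r) { Yajl.dump(r) }
rescue LoadError
  # yajl-ruby is not installed; only the stdlib serializer is measured.
end

# Wall-clock seconds spent serializing the same record repeatedly.
timings = serializers.transform_values do |serialize|
  Benchmark.realtime { iterations.times { serialize.call(record) } }
end

timings.each { |name, sec| puts format("%-14s %.3f sec", name, sec) }
```

Numbers from a loop like this depend heavily on the Ruby and json versions in use, so they complement the end-to-end measurement above rather than replace it.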

Docs Changes:

Release Note:

@Watson1978 Watson1978 marked this pull request as ready for review January 7, 2025 00:51
@daipom daipom added this to the v1.19.0 milestone Jan 7, 2025

@daipom daipom left a comment

LGTM! Thanks!

@daipom daipom merged commit 2df84a9 into fluent:master Jan 9, 2025
13 checks passed
@Watson1978 Watson1978 deleted the yajl-ruby branch January 10, 2025 00:35

kenhys commented Feb 3, 2025

NOTE: will not be backported to v1.16

daipom pushed a commit that referenced this pull request Feb 3, 2025
…nce (#4813)

**Which issue(s) this PR fixes**: 
Fixes #

**What this PR does / why we need it**: 
Recently, Ruby's json library has gained significant performance improvements.
It might even be faster than the oj gem.
So, I think json is suitable as the fallback.

This is similar to #4759.

Here is a simple benchmark. (I used the same log file as in #4759.)

* Before
  * It took 90.50963662 sec to process a 10 GB file
* After
  * It took 74.624230077 sec to process a 10 GB file

* config
```
<source>
  @type tail
  path "#{File.expand_path '~/tmp/access*.log'}"
  pos_file "#{File.expand_path '~/tmp/fluentd/access.log.pos'}"
  tag log
  read_from_head true
  <parse>
    @type json
  </parse>
</source>

<match **>
  @type file
  path "#{File.expand_path '~/tmp/fluentd/log'}"
</match>
```
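
The fallback idea can be sketched as follows. This is a minimal illustration and not Fluentd's actual implementation: `pick_json_parser` is a hypothetical helper that prefers the oj gem when it can be loaded and otherwise falls back to the standard json library.

```ruby
require "json"

# Hypothetical helper (not part of Fluentd): returns the backend name and a
# parse lambda, preferring oj and falling back to the stdlib json library.
def pick_json_parser
  require "oj"
  [:oj, ->(text) { Oj.load(text, mode: :strict) }]
rescue LoadError
  # oj is not installed, so use Ruby's bundled json instead.
  [:json, ->(text) { JSON.parse(text) }]
end

backend, parse = pick_json_parser
record = parse.call('{"message":"hello","code":200}')
puts "backend=#{backend} record=#{record.inspect}"
```

Both branches return a Hash with string keys for this input, so calling code does not need to know which backend was selected.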

FYI:
* https://byroot.github.io/ruby/json/2024/12/15/optimizing-ruby-json-part-1.html
* https://byroot.github.io/ruby/json/2024/12/18/optimizing-ruby-json-part-2.html
* https://byroot.github.io/ruby/json/2024/12/27/optimizing-ruby-json-part-3.html
* https://byroot.github.io/ruby/json/2024/12/29/optimizing-ruby-json-part-4.html
* https://byroot.github.io/ruby/json/2025/01/04/optimizing-ruby-json-part-5.html
* https://byroot.github.io/ruby/json/2025/01/12/optimizing-ruby-json-part-6.html
* https://byroot.github.io/ruby/json/2025/01/14/optimizing-ruby-json-part-7.html



**Docs Changes**:
fluent/fluentd-docs-gitbook#560

**Release Note**:

Signed-off-by: Shizuo Fujita <[email protected]>