Hopefully, this will be first of many discussions I'll be part of. Found SN yesterday, very cool.

I've got a single host handling both Splunk indexing and searching. I'd like to give searching priority over indexing. Ordinarily, being a UNIX hack, I would nice +19 the Splunk processes' parent process.

root 3492 11.7 12.7 341212 263652 ? SNl Nov09 335:00 splunkd -p 8080 restart
root 3493 0.0 0.0 17916 1728 ? SNs Nov09 0:28 splunkd -p 8080 restart
root 3543 0.9 2.2 187340 46368 ? SNl Nov09 26:25 python -O /opt/splunk/lib/python2.6/site-packages/splunk/appserver/mrsparkle/root.py restart

I've tried that, and oops, all Splunk processes ended up +19, not just the search processes.

Is there a way to do this through the GUI? Do I need to renice a specific process?

Thanks,

-dave


Replies to This Discussion

A couple of things come to mind...

What's the profile of your machine? CPU/Memory/Disk Space & Speed.

Are you finding that searches are slow? If so, what types of searches are you doing? How many are running simultaneously? Do you have any apps installed? Some of them have scheduled searches that generate summary indexes.

Give me a little more color on your installation, and I can probably give you some insight into performance.
The machine looks like this: 4 CPU cores, 2GB RAM, 70GB disk, about 30GB used. When I profile the machine (top, sar, iostat, vmstat) I see that the splunkd process (which appears to be the primary indexing agent) will consume up to 400% CPU (all available CPU) when new batches of data arrive (happens every 30 minutes).

Searches get bogged down when the indexing process is running and chewing up a lot of CPU. There are a few dashboards configured, and they kick off concurrent search processes which have to compete with the indexer if all Splunk associated processes are at the same UNIX nice level.

The searches being done are all ad-hoc. I haven't figured out how to understand Summary Indexing (although, this is on my list of things to grok), which I do know will speed things up a bit.

Simultaneous searches sometimes reach the max of 10 concurrent searches allowed by our Splunk install. I suspect this is changeable, but haven't figured out where yet.

There are no Apps installed on this box beyond what comes with the basic Splunk 4.0.5 distribution. It has no access to the outside world, and so the app installer/browser just hangs.

While I don't pretend to understand what Splunk is doing under the covers, I am pretty sure that the primary issue here is the indexing process wanting CPU and taking it while splunk-search processes are running. The way *I* know how to solve it in UNIX land is adjusting the priorities of the processes via /bin/nice (during startup) or renice (once things are already running). If there is a Splunk preferred way to do this, I would be very appreciative of a pointer to it.

Thanks!

-dave
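
For reference, the selective renice described above can be sketched in Python. This is only a rough sketch under some loud assumptions: that search processes can be told apart from the main splunkd in `ps aux` output by a `search` argument on the command line (the sample search line below is made up for illustration, not taken from this box), and that the script runs as root, since moving a process from +19 back to a lower nice value needs privileges. The helper names are hypothetical and none of this is a Splunk-provided mechanism.

```python
import os

def find_search_pids(ps_lines, marker="search"):
    """Return PIDs of `ps aux`-style lines that look like Splunk search processes.

    Assumption: a search process runs the splunkd binary with `marker`
    as one of its arguments. Adjust `marker` to whatever your ps shows.
    """
    pids = []
    for line in ps_lines:
        fields = line.split(None, 10)  # ps aux has 11 columns; the last is the command
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something we cannot parse
        tokens = fields[10].split()
        if os.path.basename(tokens[0]) == "splunkd" and marker in tokens[1:]:
            pids.append(int(fields[1]))
    return pids

def renice(pids, niceness=0):
    """Set the nice value of each PID (root is needed to lower a nice value)."""
    for pid in pids:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)

SAMPLE = [
    "root 3492 11.7 12.7 341212 263652 ? SNl Nov09 335:00 splunkd -p 8080 restart",
    "root 3493 0.0 0.0 17916 1728 ? SNs Nov09 0:28 splunkd -p 8080 restart",
    # Hypothetical search process line, for illustration only:
    "root 4021 35.0 1.2 90000 25000 ? SN 10:15 0:03 splunkd search --id=example",
]

print(find_search_pids(SAMPLE))  # [4021]
```

On a live box you would feed it the lines from `ps aux` and pass the result to `renice`, probably from cron or a loop, because each new search process inherits the +19 of its parent and would need adjusting again.
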
One more question... 32bit or 64bit?
Dave... I deleted your reply by accident... there was a spammer in here.


Yes. Splunk does take advantage of 64bit in a MAJOR WAY! When indexing on a 32bit machine, Splunk can store its "buckets" in a max of 200MB per bucket, meaning every 200MB of compressed data will result in another file. Why does this matter? When Splunk is searching and retrieving the raw events, if your events are spread over many buckets, which they will likely be, you'll be opening (unzipping) and closing buckets like mad, because they're only 200MB each.

64bit Splunk stores buckets of up to 10GB per bucket. So for searches, you may be looking in only one bucket (for example). The processor's throughput and size make all the difference. 64 bit machines usually have more cores as well, which will result in faster/better performance.
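
To put rough numbers on that, here is a quick back-of-the-envelope calculation. It uses the bucket sizes quoted above and assumes, purely for illustration, about 30GB of compressed index data (borrowed from the disk usage figure mentioned earlier in the thread, which is not necessarily the real index size). The function name is made up.

```python
import math

def bucket_count(index_mb, max_bucket_mb):
    """Minimum number of buckets needed to hold index_mb of compressed data."""
    return math.ceil(index_mb / max_bucket_mb)

INDEX_MB = 30 * 1024  # assumption: 30GB of indexed data

print(bucket_count(INDEX_MB, 200))        # 200MB buckets (32bit figure above): 154
print(bucket_count(INDEX_MB, 10 * 1024))  # 10GB buckets (64bit figure above): 3
```

So a search spanning the whole index would touch on the order of 150 buckets in the first case versus a handful in the second.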
I hear that. I try to live in 64-bit land whenever possible.

When the real production Splunk gear arrives, it'll be a 64-bit OS installation, so I'm hoping that I see some improvement in speed over equivalent hardware running a 32-bit OS.

Thanks Mike,

-dave


© 2012   Created by Michael Wilde.
