Part 2 - Troubleshooting Linux Scenarios 🧑‍💻

Hello! I'm passionate about DevOps and have over a year of experience in the field. I'm proficient in a variety of cutting-edge technologies and always motivated to expand my knowledge and skills. Let's connect and grow together!
SKILLS:
🔹 Languages & Runtimes: Python, Shell Scripting, HCL, YAML
🔹 Cloud Technologies: AWS, Microsoft Azure, GCP
🔹 Infrastructure Tools: Docker, Terraform, AWS CloudFormation
🔹 Other Tools: Linux, Git and GitHub Actions, Jenkins, Jira, GitLab (beginner), Docker, AWS DevOps
🔹 Web Development: HTML, CSS, Bootstrap, Python, SQL
Job & Responsibilities:
🔹 Improved development efficiency by implementing CI/CD pipelines, resulting in a 30% reduction in deployment time on the test server.
🔹 Strengthened deployment and testing reliability by using Docker containers and optimizing the Dockerfile, reducing development issues on the test server by 20%.
⚙️ Automated S3 bucket log creation with Shell scripting, eliminating 100% of manual search and saving 2 hours per week.
🔹 Scheduled EC2 instance start/stop using Lambda functions and EventBridge, leading to a 25% decrease in infrastructure costs.
🔧 Used AWS, Linux, Python, Docker, Shell scripting, Terraform, Jenkins Pipelines, and automation to streamline workflows and improve overall system performance.
I'm very detail-oriented and possess strong written and verbal communication skills. As a high performer with a possibility mindset, I strive to solve problems using efficient approaches.
Let's Connect & Grow:
If you find my profile suitable for the role you are searching for, please feel free to reach out to me at sumanprasad9766@gmail.com.
Issue 1: Unable to connect to a website or an application
🛠️ Approach / Solution:
├── Ping the server by hostname and IP address
│   ├── False: see Issue 3, "Server is not reachable or unable to connect"
│   ├── True: Check service availability using telnet with the service port
│   │   ├── True: Service is running
│   │   ├── False: Service is not reachable or not running
│   │   │   ├── Check the service status using systemctl or another command
│   │   │   ├── Check the firewall/SELinux
│   │   │   ├── Check the service logs
│   │   │   ├── Check the service configuration
├── ...
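The first two branches of this tree are easy to script. The sketch below is one possible way to do it, not a canonical procedure: `next_step` and `probe` are names made up for this example, `nc -z` is used as a scriptable stand-in for the interactive telnet check, and it assumes `ping` (iputils) and `nc` are installed.

```shell
# next_step PING_OK PORT_OK   (each argument is "yes" or "no")
# Maps the two probe results onto the branch of the tree to follow.
next_step() {
    if [ "$1" != "yes" ]; then
        echo "server unreachable: follow Issue 3"
    elif [ "$2" != "yes" ]; then
        echo "port closed: check service status, firewall/SELinux, logs, config"
    else
        echo "service is reachable"
    fi
}

# probe HOST PORT: runs the real checks and hands the results to next_step.
probe() {
    ping_ok=no
    port_ok=no
    ping -c 2 -W 2 "$1" >/dev/null 2>&1 && ping_ok=yes
    # nc -z only tests whether the TCP port accepts a connection.
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1 && port_ok=yes
    next_step "$ping_ok" "$port_ok"
}
```

For example, `probe web01.example.com 443` (a placeholder host) prints which branch to follow next. Note that some networks block ICMP, so a failed ping alone does not always mean the server is down.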
Issue 2: Unable to get an IP address
🛠️ Approach / Solution:
├── IP Assignment Methods
│   ├── DHCP
│   │   ├── Fixed Allocation
│   │   ├── Dynamic Allocation
│   ├── Static
├── Troubleshooting
│   ├── Check the network settings in the virtualization environment (VMware, VirtualBox, etc.)
│   ├── Check whether an IP address is assigned
│   ├── Check the NIC status from the host side using lspci, nmcli, etc.
│   ├── Restart the network service
├── ...
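The "is an IP address assigned?" check can be automated by reading the one-line output of `ip -o -4 addr show`. This is a minimal sketch: `has_ipv4` is a hypothetical helper name, `eth0` is a placeholder interface, and the restart command assumes a NetworkManager-based system.

```shell
# has_ipv4 IFACE: reads `ip -o -4 addr show` output on stdin and
# succeeds (exit 0) only if IFACE has at least one IPv4 address.
has_ipv4() {
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { found = 1 } END { exit !found }'
}

# Typical use on the affected host:
#   ip -o -4 addr show | has_ipv4 eth0 || echo "eth0 has no IPv4 address"
#   nmcli device status                      # is the NIC managed and connected?
#   lspci | grep -i ethernet                 # does the host see the NIC at all?
#   sudo systemctl restart NetworkManager    # unit name varies by distribution
```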
Issue 3: Server is not reachable or unable to connect
🛠️ Approach / Solution:
├── Ping the server by hostname and IP address
│   ├── Hostname and IP address are both pingable
│   │   ├── The server is reachable, so the issue is probably on the client side
│   ├── Hostname is not pingable but IP address is pingable
│   │   ├── Could be a DNS issue
│   │   │   ├── Check /etc/hosts
│   │   │   ├── Check /etc/resolv.conf
│   │   │   ├── Check /etc/nsswitch.conf
│   │   │   ├── (Optional) DNS can also be defined in /etc/sysconfig/network-scripts/ifcfg-<interface>
│   ├── Neither hostname nor IP address is pingable
│   │   ├── Check another server on the same network to see whether there is a network-wide access problem
│   │   │   ├── False: The issue is not network-wide; it is specific to that host/server
│   │   │   ├── True: Might be a network-wide issue
│   │   ├── Log in to the server through the virtual console if it is powered on, and check the uptime
│   │   ├── Check that the server has an IP address and that the network interface is UP
│   │   │   ├── (Optional) Also check IP-related settings in /etc/sysconfig/network-scripts/ifcfg-<interface>
│   │   ├── Ping the gateway and check the routes
│   │   ├── Check SELinux and firewall rules
│   │   ├── Check the physical cable connection
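The hostname-versus-IP comparison at the top of this tree can be turned into a small decision helper. Treat this as a sketch under assumptions: `classify_ping` and `check_host` are invented names, and the follow-up commands in the comments are common choices rather than the only ones.

```shell
# classify_ping NAME_RESULT IP_RESULT   (each argument is "ok" or "fail")
classify_ping() {
    case "$1:$2" in
        ok:ok)     echo "server reachable: look at the client side" ;;
        fail:ok)   echo "likely DNS: check /etc/hosts, /etc/resolv.conf, /etc/nsswitch.conf" ;;
        fail:fail) echo "host or network down: test a neighbour, then use the console" ;;
        *)         echo "unexpected result: re-run both pings" ;;
    esac
}

# check_host HOSTNAME IP: pings both and prints the matching branch.
check_host() {
    name_ping=fail
    ip_ping=fail
    ping -c 2 -W 2 "$1" >/dev/null 2>&1 && name_ping=ok
    ping -c 2 -W 2 "$2" >/dev/null 2>&1 && ip_ping=ok
    classify_ping "$name_ping" "$ip_ping"
}

# Follow-up checks once logged in through the console:
#   getent hosts HOSTNAME    # how does this machine resolve the name?
#   ip -br addr              # does the interface have an IP and is it UP?
#   ip route                 # is there a default route via the gateway?
```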
Issue 4: Unable to ssh as root or any other user
🛠️ Approach / Solution:
├── Ping the server by hostname and IP address
│   ├── False: see Issue 3, "Server is not reachable or unable to connect"
│   ├── True: Check service availability using telnet with the service port
│   │   ├── True: Service is running
│   │   │   ├── Issue might be on the client side
│   │   │   ├── The user might be disabled or have a nologin shell, root login might be disabled, or another configuration might block the login
│   │   ├── False: Service is not reachable or not running
│   │   │   ├── Check the service status using systemctl or another command
│   │   │   ├── Check the firewall/SELinux
│   │   │   ├── Check the service logs
│   │   │   ├── Check the service configuration
├── ...
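When the SSH port answers but the login still fails, the account and `sshd` settings named above are worth inspecting. The helpers below are a rough sketch: `permit_root_login` and `login_shell` are hypothetical names, and the first one reads only the plain `PermitRootLogin` line, ignoring `Match` blocks and `Include` files.

```shell
# permit_root_login: reads sshd_config text on stdin and prints the first
# uncommented PermitRootLogin value, or "unset" if there is none.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; found = 1; exit }
         END { if (!found) print "unset" }'
}

# login_shell USER: reads passwd-format lines on stdin and prints USER's shell.
login_shell() {
    awk -F: -v user="$1" '$1 == user { print $7 }'
}

# Typical use on the server (alice is a placeholder account):
#   permit_root_login < /etc/ssh/sshd_config
#   getent passwd alice | login_shell alice    # /sbin/nologin blocks logins
#   sudo passwd -S alice                       # shows whether the account is locked
#   sudo sshd -T | grep -i permitrootlogin     # effective value, includes and defaults applied
```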
Issue 5: Disk space is full, or disk space must be added/extended
🛠️ Approach / Solution:
├── Detecting system performance degradation
│   ├── Application getting slow/unresponsive
│   ├── Commands fail to run (for example, when / is full)
│   ├── Logging and other writes fail
├── Analyse the issue
│   ├── Use df to find the filesystem that is running out of space
├── Action
│   ├── After finding the specific filesystem, use du in that filesystem to find the large files/directories
│   ├── Compress/remove big files
│   ├── Move the items to another partition/server
│   ├── Check the health of the disks using the badblocks command (for example: # badblocks -v /dev/sda)
│   ├── Check for I/O-bound load (iostat shows per-device activity; iotop or pidstat -d shows it per process)
│   ├── Create a link to the moved file/dir
├── New disk addition
│   ├── Simple partition
│   │   ├── Add disk to VM
│   │   ├── Check the new disk with lsblk (df lists only mounted filesystems, so it will not show a new disk yet)
│   │   ├── Use fdisk to create a partition (an LVM partition is preferable)
│   │   ├── Create filesystem and mount it
│   │   ├── fstab entry for persistence
│   ├── LVM Partition
│   │   ├── Add disk to VM
│   │   ├── Check the new disk with lsblk
│   │   ├── Use fdisk to create an LVM partition
│   │   ├── Create the PV, VG, and LV
│   │   ├── Create filesystem and mount it
│   │   ├── fstab entry for persistence
│   ├── Extend LVM partition
│   │   ├── Add disk, and create LVM partition
│   │   ├── Add the LVM partition (PV) to the existing VG
│   │   ├── Extend the LV and resize the filesystem
├── ...
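The df step lends itself to a small filter that lists only the filesystems above a usage threshold. This is a sketch, assuming POSIX-format `df -P` output and mount points without spaces; `full_filesystems` is a made-up name, and the device, VG, and LV names in the comments are placeholders.

```shell
# full_filesystems LIMIT: reads `df -P` output on stdin and prints the
# mount point of every filesystem whose usage is LIMIT percent or more.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6
    }'
}

# Typical use:
#   df -P | full_filesystems 90
#   sudo du -xh --max-depth=1 /var | sort -rh | head -n 10   # GNU du/sort: largest directories
#
# Extending an existing LVM volume with a new partition:
#   sudo pvcreate /dev/sdb1
#   sudo vgextend vg_data /dev/sdb1
#   sudo lvextend -r -l +100%FREE /dev/vg_data/lv_data       # -r also resizes the filesystem
```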
Issue 6: SSL/TLS Certificate Expiry
🛠️ Approach / Solution:
├── Check SSL/TLS certificate expiration date
├── Confirm the correct certificate is being used
├── Renew the SSL/TLS certificate before expiry
├── Verify that the server's system time is accurate
├── Restart the web server after certificate renewal
├── Ensure the renewed certificate is properly configured
├── Check for any errors in the web server logs
├── Consider automating certificate renewal tasks
├── Monitor SSL/TLS certificate expiry using alerts
├── ...
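Checking the expiry date and alerting on it can be done with `openssl`. The following is a sketch under assumptions: `days_until` and `cert_days_left` are invented helper names, `cert_days_left` relies on GNU `date -d` to parse the `notAfter` value, and `example.com` is a placeholder.

```shell
# days_until EXPIRY_EPOCH NOW_EPOCH: whole days between the two timestamps
# (integer division, truncated toward zero).
days_until() {
    echo $(( ($1 - $2) / 86400 ))
}

# cert_days_left CERT_FILE: days until the PEM certificate expires.
cert_days_left() {
    end=$(openssl x509 -noout -enddate -in "$1" | cut -d= -f2)
    days_until "$(date -d "$end" +%s)" "$(date +%s)"
}

# Other useful checks:
#   echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
#       | openssl x509 -noout -enddate                    # expiry of the certificate actually served
#   openssl x509 -checkend 2592000 -noout -in cert.pem    # exit 0 if still valid in 30 days
```

A cron job that runs `cert_days_left` and sends an alert below some threshold is one simple way to cover the monitoring step.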
Issue 7: Database Connection Issue
🛠️ Approach / Solution:
├── Check if the database service is running
├── Verify database connection parameters (username, password)
├── Test database connectivity using command-line tools
├── Inspect database logs for connection errors
├── Review firewall rules for database port access
├── Confirm network accessibility between application and database servers
├── Check for any recent changes in the database configuration
├── Monitor database resource usage and performance
├── Investigate potential database server load issues
├── ...
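A quick way to tell "database not running" apart from "database blocked by the network" is to check whether its port is listening on the database server itself. The sketch below assumes `ss` from iproute2; `is_listening` is a hypothetical helper, and port 5432 (the PostgreSQL default) and the host name are only examples.

```shell
# is_listening PORT: reads `ss -ltn` output on stdin and succeeds (exit 0)
# only if some socket is listening on PORT.
is_listening() {
    awk -v port="$1" 'NR > 1 {
        n = split($4, parts, ":")      # $4 is "Local Address:Port"
        if (parts[n] == port) found = 1
    } END { exit !found }'
}

# Typical use:
#   ss -ltn | is_listening 5432 || echo "nothing is listening on 5432"   # on the database server
#   systemctl is-active postgresql        # unit name depends on the database and distribution
#   nc -z -w 3 db.example.internal 5432   # from the application server
```

If the port is listening locally but `nc` fails from the application server, the firewall and network steps in the tree are the next place to look.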
Issue 8: Web Application 404 Error
🛠️ Approach / Solution:
├── Verify that the requested URL is correct
├── Check web server logs for 404 error details
├── Confirm that the file or resource exists on the server
├── Review web server configuration for correct document root
├── Inspect file and directory permissions
├── Clear web browser cache and retry
├── Consider URL rewriting rules if applicable
├── Test with different browsers or devices
├── Investigate if there are any recent website changes
├── ...
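The log and URL checks can be scripted. This sketch assumes an access log in the common/combined format used by default in Apache and nginx, where the request path is field 7 and the status code is field 9; `paths_404` and `http_status` are made-up names, and the log path and URL are placeholders.

```shell
# paths_404: reads access-log lines on stdin and prints each distinct
# request path that was answered with status 404.
paths_404() {
    awk '$9 == 404 { print $7 }' | sort -u
}

# http_status URL: prints only the HTTP status code returned for URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Typical use:
#   paths_404 < /var/log/nginx/access.log
#   http_status https://example.com/missing.html
```

Using `curl` rather than a browser also sidesteps the browser cache mentioned above.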
Issue 9: Slow Application Response Time
🛠️ Approach / Solution:
├── Identify bottlenecks using 'top' or 'htop'
├── Monitor application-specific logs for errors or warnings
├── Check for database connection and query performance
├── Review application code for inefficient algorithms
├── Optimize database queries and indexing
├── Investigate potential issues with external API calls
├── Monitor network latency between application components
├── Consider implementing caching mechanisms
├── Optimize server and network configurations
├── ...
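Before digging into code or queries, it helps to see where a single request spends its time. The sketch below uses curl's `--write-out` timing variables; `time_request` and `wait_time` are hypothetical helper names, and reading the gap between `time_pretransfer` and `time_starttransfer` as server-side processing is only an approximation, since it also includes one network round trip.

```shell
# time_request URL: prints curl's timing breakdown (in seconds) for one request.
time_request() {
    curl -s -o /dev/null \
         -w 'dns=%{time_namelookup} connect=%{time_connect} pretransfer=%{time_pretransfer} first_byte=%{time_starttransfer} total=%{time_total}\n' \
         "$1"
}

# wait_time PRETRANSFER FIRST_BYTE: seconds between the request being ready
# to send and the first response byte arriving.
wait_time() {
    awk -v pre="$1" -v first="$2" 'BEGIN { printf "%.3f\n", first - pre }'
}

# Typical use (placeholder URL):
#   time_request https://example.com/api/orders
#   wait_time 0.25 1.00
```

A large `dns` or `connect` value points at the network, while a large wait time points at the application or its database.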
Issue 10: High CPU Usage
🛠️ Approach / Solution:
├── Identify the process causing high CPU usage using top or htop
├── Check if the issue is intermittent or continuous
├── Review logs for any error messages or known issues
├── Inspect running processes and their resource consumption
├── Investigate potential malware or unauthorized processes
├── Consider optimizing or scaling the application
├── Monitor system metrics over time to identify patterns
├── Apply performance tuning based on the specific application
├── ...
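For a non-interactive version of the top/htop step, `ps` output can be filtered by a CPU threshold. This is a sketch with assumptions: `hot_processes` is an invented name, it expects the three columns produced by `ps -eo pid,pcpu,comm`, and because `ps` reports CPU averaged over each process's lifetime, short spikes are easier to see in `top -b -n 1`.

```shell
# hot_processes LIMIT: reads `ps -eo pid,pcpu,comm` output on stdin and
# prints "PID %CPU COMMAND" for every process at or above LIMIT percent.
hot_processes() {
    awk -v limit="$1" 'NR > 1 && $2 + 0 >= limit + 0 { print $1, $2, $3 }'
}

# Typical use:
#   ps -eo pid,pcpu,comm | hot_processes 80
#   ps -eo pid,pcpu,comm --sort=-pcpu | head -n 6    # procps: five busiest processes
```

Running the first command from cron every few minutes and appending the output to a file is a simple way to tell intermittent load from continuous load.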




