Industry Technology Inc.

Company

Key Value
Company Industry Technology Inc.
Employees 30+
Founded 2018
Website https://industrytechnology.co/
Description A proptech company delivering enterprise applications for Japan’s largest real estate companies
Location Tokyo, Japan

Team

Key Value
Title CTO / Software Engineer (Backend)
Mission As CTO, I led a development team of 8 members in creating enterprise applications. I was also hands-on, personally writing over 100,000 lines of backend code from scratch.
Task
  • Feature Development: I designed and built new product features.
  • SRE Practice Introduction: I introduced and implemented Site Reliability Engineering (SRE) practices within the team.
Term April 1, 2019 - September 30, 2024
Team Size 8
Type Permanent

Projects

Key Value
Summary IT1. I designed Service Level Indicators (SLIs) and Objectives (SLOs), and developed custom metrics and tracing to significantly enhance system reliability.
Problem
  • Major Business Opportunity: We had an opportunity to introduce our system to one of Japan's largest real estate companies.
  • Operational Readiness Gap: I realized that our existing operational capabilities did not meet the reliability and performance levels required by this potential enterprise client.
Mission Achieve Enterprise-Grade Reliability: My goal was to elevate system reliability to meet the stringent operational standards expected by a major enterprise customer.
Action
  • Designed SLIs/SLOs: I designed SLIs and SLOs based on Critical User Journeys (CUJs) to define and measure reliability.
  • Developed Observability Features:
    • I implemented custom metrics to track key performance indicators.
    • I integrated distributed tracing to provide insights into request flows.
Challenge Balancing SLOs and Release Velocity: A key challenge was tuning the SLOs appropriately. Because any SLO violation required halting releases, an overly strict target would have repeatedly blocked deployments and slowed development.
Overcome Conducted Load Testing and Iterative Optimization: I performed comprehensive load tests to observe SLI behavior under traffic spikes and used this data to set realistic and achievable SLOs, iteratively optimizing as needed.
Result Secured Major Enterprise Contract: The improved system reliability and demonstrated operational maturity strengthened customer trust, directly leading to securing a major enterprise contract.
Skill SLI/SLO / Observability / Monitoring / Load Test
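
The link between an SLO violation and a release halt in IT1 can be illustrated with an error-budget check. This is a minimal sketch, not the production implementation: the `SLO` and `Report` types, the request-count based SLI, and the 99.9% target are assumptions chosen for illustration.

```go
package main

import "fmt"

// SLO is a hypothetical availability-style objective over a window of requests.
type SLO struct {
	Target float64 // required fraction of good events, e.g. 0.999 (assumed value)
}

// Report summarizes how much of the error budget a window has consumed.
type Report struct {
	SLI             float64 // observed fraction of good events
	BudgetRemaining float64 // 1 = untouched, 0 = exhausted, negative = overspent
	HaltReleases    bool    // true once no budget is left
}

// Evaluate computes the SLI and the remaining error budget for one window.
func (s SLO) Evaluate(good, total uint64) Report {
	if total == 0 {
		// No traffic means no evidence of failure; the budget stays untouched.
		return Report{SLI: 1, BudgetRemaining: 1}
	}
	if good > total {
		good = total // guard against inconsistent counters
	}
	bad := float64(total - good)
	allowedBad := (1 - s.Target) * float64(total)

	remaining := 1.0
	switch {
	case allowedBad > 0:
		remaining = 1 - bad/allowedBad
	case bad > 0:
		// A 100% target leaves no budget, so any failure exhausts it.
		remaining = 0
	}
	return Report{
		SLI:             float64(good) / float64(total),
		BudgetRemaining: remaining,
		HaltReleases:    remaining <= 0,
	}
}

func main() {
	slo := SLO{Target: 0.999}
	r := slo.Evaluate(999_500, 1_000_000)
	fmt.Printf("SLI=%.4f budget remaining=%.2f halt=%v\n", r.SLI, r.BudgetRemaining, r.HaltReleases)
}
```

Feeding load-test results through a check like this shows how quickly a candidate target burns its budget under a traffic spike, which is one way to pick a target that is strict but still achievable.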

Key Value
Summary IT2. I developed a group chat service and a notification system, enabling users to exchange photos, videos, and text messages.
Problem Missing Essential Chat Functionality: The system lacked a required chat function, which was crucial for daily business operations and user collaboration.
Mission Deliver Robust Chat and Notification Capabilities: My objective was to develop a comprehensive group chat service and an efficient notification system to meet business needs.
Action
  • Developed Group Chat Feature:
    • I implemented secure media sharing (photos, videos) using S3 with Signed URLs.
    • I integrated a Content Delivery Network (CDN) for optimized and fast content delivery.
  • Developed Notification System:
    • I built an event-driven architecture to ensure efficient and timely notifications.
Challenge Meeting Tight Development Schedule: The primary constraint was a very tight schedule, which did not allow sufficient time to build a more complex WebSocket-based real-time solution.
Overcome Adopted Pragmatic Polling-Based Approach: To meet the deadline, I extended the existing API using a polling mechanism for near real-time updates, instead of implementing a WebSocket server.
Result Improved User Engagement and Productivity by 25%: The new system provided a seamless chat and notification experience, resulting in a 25% improvement in user engagement and productivity.
Skill Feature Development
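
The polling approach in IT2 can be sketched as a cursor-based fetch: the client remembers the last message ID it saw and asks only for newer ones. This is an illustrative in-memory sketch; the `Store`, `Message`, `Post`, and `Since` names and the room name are hypothetical, and the real service's API and storage are not described in this document.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical chat message with a monotonically increasing ID.
type Message struct {
	ID   uint64
	Room string
	Body string
}

// Store is an in-memory stand-in for the chat backend.
type Store struct {
	mu     sync.Mutex
	nextID uint64
	msgs   []Message
}

// Post appends a message and assigns it the next ID.
func (s *Store) Post(room, body string) Message {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.nextID++
	m := Message{ID: s.nextID, Room: room, Body: body}
	s.msgs = append(s.msgs, m)
	return m
}

// Since returns the messages in room with an ID greater than cursor,
// together with the cursor the client should send on its next poll.
func (s *Store) Since(room string, cursor uint64) ([]Message, uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	var out []Message
	next := cursor
	for _, m := range s.msgs {
		if m.Room == room && m.ID > cursor {
			out = append(out, m)
			next = m.ID
		}
	}
	return out, next
}

func main() {
	store := &Store{}
	store.Post("site-a", "Photos uploaded")
	store.Post("site-a", "Inspection at 14:00")

	// Each poll sends the cursor returned by the previous one.
	var cursor uint64
	for i := 0; i < 2; i++ {
		var msgs []Message
		msgs, cursor = store.Since("site-a", cursor)
		fmt.Printf("poll %d: %d new message(s), cursor=%d\n", i+1, len(msgs), cursor)
	}
}
```

A cursor keeps each poll cheap and idempotent, at the cost of latency bounded by the polling interval, which is the trade-off accepted here in place of a WebSocket server.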

Key Value
Summary IT3. I tuned database indexes and optimized queries, significantly improving overall database performance.
Problem
  • Frequent Slow Queries Impacting UX: A core feature of the application suffered from repeated slow queries, negatively impacting user experience.
  • Resource Spikes During Peak Hours: These slow queries led to high CPU and memory utilization on the database, especially during peak usage times (e.g., 9:00 AM).
Mission Diagnose and Resolve Performance Bottlenecks: My goals were to diagnose the root causes of the performance bottlenecks and optimize query execution for improved stability and speed.
Action
  • Optimized Database Indexes: I analyzed query patterns and implemented appropriate indexes to accelerate data retrieval.
  • Refactored Inefficient Queries:
    • I identified and eliminated unnecessary table joins.
    • I rewrote queries to ensure they effectively utilized the newly created and existing indexes.
Challenge Identifying Root Cause of Complex Slow Queries: Pinpointing the exact source of slow queries was challenging due to complex data models and intricate query execution patterns.
Overcome
  • Utilized Query Tracing: I monitored and analyzed distributed tracing data to detect anomalies and identify problematic query paths.
  • Performed EXPLAIN Analysis: I used EXPLAIN statements to analyze query execution plans, specifically looking for full table scans and inefficient index usage.
Result Reduced Slow Queries by 70% and Stabilized Performance: These optimizations led to a 70% reduction in slow query occurrences and stabilized system performance, especially during peak hours.
Skill Performance Optimization
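
The EXPLAIN review in IT3 can be partly automated. In MySQL's EXPLAIN output, an access `type` of `ALL` indicates a full table scan. The sketch below flags such rows once they exceed a row-estimate threshold; the `ExplainRow` struct, the `FindFullScans` helper, the threshold, and the table names are hypothetical, and it assumes the plan rows have already been read into memory.

```go
package main

import (
	"fmt"
	"strings"
)

// ExplainRow holds the subset of MySQL EXPLAIN columns used in this sketch.
type ExplainRow struct {
	Table string
	Type  string // access type; "ALL" means a full table scan
	Key   string // index chosen by the optimizer; empty stands in for NULL
	Rows  int64  // optimizer's estimate of rows examined
}

// FindFullScans returns the plan rows that scan a whole table and are
// estimated to examine at least minRows rows.
func FindFullScans(plan []ExplainRow, minRows int64) []ExplainRow {
	var flagged []ExplainRow
	for _, r := range plan {
		if strings.EqualFold(r.Type, "ALL") && r.Rows >= minRows {
			flagged = append(flagged, r)
		}
	}
	return flagged
}

func main() {
	// Hypothetical plan for a two-table join.
	plan := []ExplainRow{
		{Table: "work_diaries", Type: "ALL", Rows: 480000},
		{Table: "users", Type: "eq_ref", Key: "PRIMARY", Rows: 1},
	}
	for _, r := range FindFullScans(plan, 1000) {
		fmt.Printf("full scan on %s (~%d rows): consider an index\n", r.Table, r.Rows)
	}
}
```

The row threshold keeps small lookup tables out of the report, since a full scan over a handful of rows is rarely the bottleneck.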

Key Value
Summary IT4. I led the migration from a modular monolith to a microservices architecture to enhance system scalability.
Problem
  • Scalability Limits of Monolithic Architecture: The system was initially built as a modular monolith, which began to show scalability limitations.
  • Increased Database Load on Core Features: As user traffic to core features grew, the write load on the central database significantly increased, impacting performance.
Mission Improve Scalability through Microservices: My objective was to create independent microservices from the primary monolithic service to distribute the load and improve overall system scalability.
Action
  • Executed Phased Microservice Migration: I migrated the system to microservices incrementally to minimize risk and ensure stability.
  • Created Key Microservices: I designed and implemented three initial microservices for distinct domains: Work Diary, Attendance Management, and a core Main service.
Challenge Minimizing Downtime During Migration: A critical requirement was to perform the migration with minimal or zero downtime to avoid disrupting users.
Overcome Implemented BFF for Gradual Traffic Switching: I introduced a Backend-for-Frontend (BFF) layer in front of the new microservices. This allowed us to gradually switch traffic from the monolith to each new service, enabling a seamless and controlled migration.
Result Reduced Database Load and Improved Scalability: The migration successfully distributed traffic for core features across independent microservices, significantly reducing the write load on the main database and improving overall system scalability.
Skill Migration
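
One way a BFF can switch traffic gradually, as in IT4, is percentage-based routing keyed on a stable hash of the user ID. The sketch below is an assumption about how such switching could work, not a description of the actual BFF; the `Rollout` type, the `Route` method, and the user IDs are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Backend names the system that serves a request.
type Backend string

const (
	Monolith     Backend = "monolith"
	Microservice Backend = "microservice"
)

// Rollout is a hypothetical per-feature switch: Percent of users (0-100)
// are routed to the new microservice, the rest stay on the monolith.
type Rollout struct {
	Percent uint32
}

// bucket maps a user ID to a stable value in [0, 100).
func bucket(userID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(userID)) // Write on an FNV hash never returns an error
	return h.Sum32() % 100
}

// Route picks the backend for a user. The same user always lands in the
// same bucket, so raising Percent only ever moves users toward the new service.
func (r Rollout) Route(userID string) Backend {
	if bucket(userID) < r.Percent {
		return Microservice
	}
	return Monolith
}

func main() {
	users := []string{"user-1", "user-2", "user-3", "user-4"}
	for _, pct := range []uint32{0, 50, 100} {
		r := Rollout{Percent: pct}
		for _, u := range users {
			fmt.Printf("%3d%% %s -> %s\n", pct, u, r.Route(u))
		}
	}
}
```

Hashing instead of random sampling keeps each user on one backend between requests, and setting the percentage back to 0 returns all traffic to the monolith.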

Key Value
Summary IT5. I managed the migration from MySQL 5.7 to MySQL 8.0 to ensure long-term system reliability and maintainability.
Problem
  • Approaching End-of-Life for MySQL 5.7: The existing MySQL 5.7 database version was nearing its end-of-life, necessitating an upgrade to a supported version.
  • Inherent Migration Complexity and Risks: Database migrations of this scale are inherently complex and carry potential risks of data loss or extended downtime if not managed carefully.
Mission Execute Seamless Upgrade to MySQL 8.0: My mission was to seamlessly upgrade the database to MySQL 8.0 while ensuring system stability, data integrity, and optimal performance.
Action
  • Developed Detailed Runbook: I created a comprehensive, step-by-step migration plan (runbook) to ensure a predictable and controlled process.
  • Implemented Blue-Green Deployment Strategy: I utilized a blue-green deployment strategy for the database, which allowed for a smooth transition, minimized downtime, and provided a clear rollback path.
Challenge Unexpected Incident During Migration: Despite careful planning, unexpected issues surfaced post-migration, impacting user experience on certain service pages.
Overcome
  • Rapid Investigation & Root Cause Analysis:
    • Users reported encountering an error dialog on specific service pages after the migration.
    • I quickly determined that the cause was a JSON unmarshal error: during the migration, the data type of a column in certain tables had unintentionally changed from UNSIGNED INT to DOUBLE.
  • Immediate Temporary Fix: I deployed a hotfix to gracefully handle the JSON parsing errors, restoring service availability.
  • Definitive Permanent Fix: I then reverted the affected columns to their original UNSIGNED INT definition, fully resolving the compatibility issue and ensuring data integrity.
Result Increased Query Performance by 20%: The migration to MySQL 8.0, once stabilized, resulted in a 20% increase in overall query performance.
Skill Incident Response / Migration

Technology

Value Tag
Go Backend
gRPC Backend
MySQL Backend
PostgreSQL Backend
Redis Backend
AWS Infrastructure
Google Cloud Infrastructure
Terraform Infrastructure
Datadog Monitoring
Prometheus Monitoring
Next.js Frontend
React Frontend
TypeScript Frontend